Preface

This thesis represents an exploratory investigation of the Kabrisky
model of the human visual system with psychological correlates. There
are  many  psychological  theories  that  attempt  to  describe  perceptual 
processes;  however,  these  theories  tend  to  be  restrictive  in  scope  as 
well  as  detail.  It  is  not  fully  known  what  pattern  information  is 
necessary  for  human  pattern  identification  or  why  the  human  visual 
system  recognizes  some  patterns  more  readily  at  0°  and  90°  than  at  45°. 
Additionally,  it  is  not  known  how  the  human  visual  system  groups  dot 
patterns  into  meaningful  units  or  why  the  system  fails  with  simple 
geometric  illusions. 

This  investigation  is  concerned  with  these  perceptual  phenomena 
and  attempts  to  offer  a  unified  explanation  of  them  in  terms  of  the  low- 
pass  spatial  filtering  used  by  the  model. 

I  gratefully  acknowledge  the  stimulating  environment  as  well  as 
the  advice  and  encouragement  provided  by  Dr.  Matthew  Kabrisky  during 
this research. Also acknowledged are the critical comments provided
by Capt. Joseph Carl, Sqn. Ldr. Tony Gill, Capt. Larry Goble, and
Mr. Frank Maher. I am also indebted to Dr. Hans Oestricher and Lt. Fred
Howard for permitting me to use their laboratory facilities, as well as
Tom Herbert for providing the special spatial filters and Capt.
Dennis  Quine  for  the  realization  of  most  of  the  photographs  appearing 
herein.  Additionally,  I  thank  the  subjects  who  volunteered  for  the 
psychological experiments.

Abstract


A model of the human visual system is investigated for psychological
correlates. A priori hypotheses from the model concerned with
human identification of defocused letters as well as identification of
rotated letters have been validated with the computer model. Gestalt
principles of similarity, proximity, closure, and figure-ground
perception as well as geometric illusions are explained in terms of spatial
filter bandwidth using the optics homolog. The experimental results
have allowed postulates which extend the model by means of another
cortical transform and a spatial-filter shape which is also
psychologically correlated. It is further postulated that the human perceptual
space is the image domain from spatially filtered transforms of object
forms. It is concluded that the model provides a means of obtaining
quantitative psychological correlates of the human visual system.
Recommendations are made for additional investigations concerning
psychological correlates.

PSYCHOLOGICAL CORRELATES OF A MODEL OF
THE HUMAN VISUAL SYSTEM


I.  Introduction 


Background 

The  way  that  the  cortex  processes  human  perceptual  information  is 
largely  unknown.  However,  there  have  been  numerous  models  proposed  to 
explain  possible  cortical  information  processing  of  the  human  visual 
system.  The  two  basic  approaches  that  have  been  used  to  model  nervous 
systems  will  be  termed  microscopic  and  macroscopic.  The  microscopic 
approach  is  concerned  with  the  modeling  of  neurons  into  nerve  nets  or 
neuromimes to determine the operating principles of nervous systems,
e.g., the McCulloch and Pitts Model (Ref 43:145-172). The macroscopic
approach is concerned with the basic structure of subsystems of the
nervous system, e.g., the Kabrisky Model (Ref 26). The argument against
the  microscopic  approach  is  humorously  but  cogently  presented  by 
Kabrisky  in  "Birds  and  Feathers  -  Neurons  and  Nervous  Systems"  (Ref  25). 
Based  upon  known  anatomical  and  physiological  data  of  the  human  visual 
system,  Kabrisky  investigated  the  mathematics  that  the  infra-cortical 
structure  could  support.  The  validity  of  this  approach  is  suggested  by 
the  fact  that  it  worked.  Improvements  to  the  Kabrisky  model  by  Radoy 
(Ref  42),  Tallman  (Refs  45  and  46),  and  Gill  (Ref  15)  have  resulted  in 
a  pattern  recognition  algorithm  based  upon  an  adaptive  low-pass 
spatially filtered Fourier transform that scales and classifies
alphanumeric and geometric patterns with more than 95% accuracy.
Additionally, the Tallman-Radoy algorithm has been the basis for such
diverse classification problems as random one-dimensional waveforms,
metal-crystal structure, speech recognition, and Chinese character
recognition (Ref 16). Furthermore, the model has been extended by
Blackford to include color pattern recognition capabilities with
substantial success (Ref 5).

The demonstrated success of the Kabrisky model embodied by the
Tallman-Radoy computer algorithm has led to a second look at the model
and has prompted such questions as: Is the Kabrisky model conceptually
valid? Can it provide metrics for human pattern recognition? Maher, in
an  initial  psychological  investigation  of  the  Kabrisky  model,  correlated 
the  computer  model  output  with  human  output  for  the  ranked  similarity 
of  animal  forms  (Ref  34).  The  resulting  correlation  coefficient  of 
0.961  supported  the  validity  of  the  model.  However,  the  model  has 
prompted  other  hypotheses  for  psychological  correlates.  Kabrisky  has 
suggested  that  cortical  retention  of  "blurred"  patterns  is  adequate  for 
human  identification  (Ref  28:137).  Additionally,  the  model's  failure 
to  classify  patterns  under  rotation  has  suggested  another  correlate  to 
human performance. This paper is concerned with exploratory
investigations of these two hypotheses with the computer model. During this
investigation,  evidence  for  correlates  with  the  Gestalt  principles  of 
perceptual  organization  and  geometric  illusions  emerged  and  was  pursued 
using  a  Fourier  optics  homolog.  Furthermore,  the  model  is  extended  with 
a  different  biologically  realizable  transform  and  spatial-filter  shape. 

Perhaps the greatest amount of information concerning the perceptual
organization of the human visual system has come from the psychologists.
Their contribution in psychophysics, experimental paradigms, and data
analysis has greatly advanced knowledge of human perception. Whereas
the psychologist cannot fully describe the mechanism for human
perception, his procedures and observations offer significant clues to the
engineer in designing and validating his model. Dodwell, in his book
on visual pattern recognition, provides an excellent example of this
approach as well as psychological theories of pattern recognition
(Ref 11).

The  major  problem  that  the  psychologists  as  well  as  the  engineers 
have  in  pattern  recognition  research  is  that  of  defining  a  feature 
space  in  which  like  patterns  or  pattern  elements  cluster.  Whereas  the 
engineer  is  free  to  use  any  feature  selection  scheme  that  works,  the 
psychologist  is  limited  to  metrics  that  must  be  correlated  to  human 
performance.  It  would  entail  a  lengthy  discussion  to  elaborate  on  the 
many  proposed  metrics  of  human  pattern  recognition;  therefore,  the 
reader  is  referred  to  an  article  by  Brown  and  Owen,  "The  Metrics  of 
Visual  Form"  (Ref  7).  It  is  sufficient  to  note  that  to  the  author's 
knowledge,  there  exists  no  feature  space  that  has  been  widely  accepted 
based upon known metrics of visual form. A priori quantitative
predictions in human perceptual research are still the exception rather
than the rule. It would appear that a valid model of the human visual
system would be a logical source for developing metrics of visual form.
This paper will demonstrate that the low-pass spatially filtered
transform domain may provide the feature space required to quantify metrics
of visual form.

Scope  and  Organization 

The  research  approach  reported  in  this  paper  has  been  that  of  a 
bioengineer  interested  in  investigating  a  mathematical  model  of  the 
human visual system against known psychological correlates. This is an
admittedly dangerous procedure due to the disparity of interpretations
concerning psychological experimental results. However, there are
gross perceptual phenomena that have been repeatedly substantiated,
e.g.,  human  perceptual  invariance  to  rotated  patterns  and  Gestalt 
principles  of  perceptual  organization.  These  perceptual  phenomena 
have  been  widely  studied,  but  the  visual  mechanism  that  produces  them 
is  not  fully  known.  It  is  these  gross  perceptual  phenomena  that  the 
author  believes  are  important  for  an  initial  validation  of  a  model  of 
the  human  visual  system.  Therefore,  rather  than  be  concerned  with  a 
definitive  psychological  experiment  to  accept  or  reject  a  model  (if 
there  is  one),  the  purpose  of  this  investigation  is  to  present  evidence 
from  disparate  psychological  experiments  that  when  examined  as  a  whole, 
will  substantiate  the  model.  This  research  is  on  an  exploratory  level 
and  hopefully  sets  the  groundwork  for  a  more  in-depth  investigation  of 
the model. This paper is also written in an attempt to bridge the gap
between the bioengineer concerned with cortical models and the
cognitive psychologist. With this in mind, detailed mathematical theory is
omitted  with  a  concomitant  sacrifice  of  rigor  for  a  more  heuristic 
development.  Adequate  references  are  made  for  the  reader  interested  in 
additional  detail. 

To orient the reader, the next section contains the biological
background and assumptions as well as the mathematics of the model,
computer and optics simulation, and spatial filtering. The following four
sections are concerned with the experiments of the model against
psychological correlates: recognition of defocused letters, recognition of
rotated letters, Gestalt principles of perception, and geometric
illusions. Each of these sections contains a brief introduction orienting
the reader to the specific objectives of each experiment, followed by
methods, results, and discussions. The experimental results and
psychological correlates that allow the model to be extended in terms of
spatial filter shape are then presented, followed by an [illegible] they
suggest. The Kabrisky model is then extended by means of another
transform as well as correlated with a contemporary psychological theory of
the human visual system. Finally, the paper will present the
conclusions and recommendations. Subject data and computer data in graph as
well as table form are contained in the appendices.

GE/EE/71S-2

II. The Kabrisky Model of the Human Visual System


Introduction

This section contains an overview of the human visual system:
the infra-cortical structure and the mathematics it could support.
A heuristic approach to the cortical computations will lead to the normal
form of the two-dimensional discrete Fourier Transform. Additionally,
the computer simulation of the model as well as the optical homolog and
spatial filtering are discussed.

Biological  Background 

In modeling any complex system, certain assumptions must be made if
the model is to be physically realizable. However, the use of
assumptions is not necessarily a negative action; indeed, they may highlight
the key concepts as well as important design areas for future
developments. Therefore, Kabrisky's model of the human visual system assumes
monocular, monochromatic, non-temporal as well as exclusively foveal
vision. Since human pattern recognition is easily performed with the
preceding constraints, the assumptions are conceptually valid and do
allow a computer realizable approximation for a model of the human visual
system.

All  visual  systems  must  initially  encode  the  external  visual  field 
in  a  manner  compatible  to  subsequent  processing  mechanisms.  The  retina 
of  the  eye,  consisting  of  rods  and  cones,  provides  the  initial  encoding 
in  the  form  of  pulse-frequency  electrical  information.  However,  the 
retina  does  not  provide  uniform  visual  acuity.  Greatest  visual  acuity 
is  limited  to  the  fovea--the  retinal  region  containing  the  most  dense 
populations of cones. The physical size of the central fovea is

The human cortex is structurally a flat thin sheet, constructed
of relatively independent small columnar units. Kabrisky calls these
columnar units "basic computational units" or BCE's (Ref 24:39).
Neurophysiological evidence confirming the existence of BCE's in cat and
monkey cortex has come from experiments by Hubel and Wiesel (Refs 23 and
50). Whereas the former physiology is well known, the details of the
cortical connectivity must be conjectured. A glance at the cortical
neural structure clearly reveals the problem. The dense and apparently
random interconnectivity has not been amenable to known
neurophysiological mapping techniques. Therefore, the mapping of this dense
interconnectivity to different cortical areas can only be deduced from such
experiments as that performed by Dusser de Barenne and McCulloch (Ref 12).
When a pinhead strychnine spot was placed in the primary visual cortex
(area 17), random strychnine spikes appeared in the association visual
cortex (area 18). Whereas this seemingly random transmission from one
cortical area to another suggests little to the physiologist, it suggests
a mathematical transform correlate to the engineer.

Model  Developments  from  the  Biology 

The preceding biological investigation led Kabrisky to conceptualize
the cortical areas 17 and 18 as densely connected two-dimensional sheets
as illustrated by Fig. 2. This conceptualization allowed Kabrisky to
suggest that human two-dimensional pattern recognition could be based
upon cross-correlation between area 17 and area 18. However, slight
variations in the input pattern easily defeat a pattern recognition
scheme based upon cross-correlation. Additionally, a cross-correlation
scheme requires a prohibitive amount of storage space for pattern memory.
To overcome these problems, Kabrisky suggested the two-dimensional
Fourier transform as a possible cortical transform (Ref 26:82). Radoy
demonstrated the validity of that concept in classifying simple patterns
with just the low spatial harmonics of the Fourier transform by
cross-correlation in the transform domain (Ref 41). Tallman expanded Radoy's
work and developed a pattern recognition scheme based upon low-pass
adaptive spatial filtering (Refs 46 and 47). Gill, using a
psychologically correlated moment technique, extended Tallman's algorithm to
include pattern scaling (Ref 15).

Fig. 2. The Kabrisky and Mahaffey Cortical Schemes (Arrows Indicate
Information Flow Direction): (a) Kabrisky's Proposed Cortical
Connectivity; (b) Mahaffey's Proposed Cortical Connectivity

It  can  be  noted  that  the  cortical  scheme  of  the  Kabrisky  model 
could  support  other  mathematical  transforms.  This  observation  led  Carl 
to  demonstrate  that  the  sequency  domain  of  the  Walsh  transform  was 
qualitatively  equal  to  the  frequency  domain  of  the  Fourier  transform  in 
similar  pattern  recognition  tasks  (Ref  8).  Subsequent  investigations 
have led Kabrisky and Carl to report a class of densely connected
transforms having unique sequency/frequency interpretations that are amenable
to low-pass spatial filtering techniques (Ref 29). Based upon a
different interpretation of the cortical structure, Mahaffey has reported
a less densely connected transform, illustrated by Fig. 2b, that does
a Fourier-like transform on the surface of area 18 (Ref 33). It is
intuitively satisfying to note that variations of mathematical
transforms as well as cortical structure have yielded qualitatively similar
results. Further research, primarily in the neurophysiological area,
will be required to determine "the" human cortical transform or
transforms.


Mathematics of the Model: the Fourier Transform

Consider  a  visual  pattern  composed  of  grey  levels,  pulse-frequency 
coded  from  the  retina.  The  pulse-frequency  coding  can  be  considered  to 
be  a  two-dimensional  intensity  sampling  of  the  visual  pattern.  Without 
a  loss  of  generality,  let  the  sampled  pattern,  p(x,y),  be  a  two- 
dimensional  square  array  over  the  visual  field  coordinates  (X,Y).  Let 
t(u,v,x,y)  be  some  cortical  function  acting  at  a  square  BCE  array  (U,V) 
between  cortical  areas  17  and  18,  which  when  multiplied  by  p(x,y), 
transforms  p(x,y)  in  some  unique  way.  Thus,  the  transformed  visual 
field  at  area  18  may  be  expressed  as 

$$P(u,v) = \sum_{X} \sum_{Y} p(x,y)\, t(u,v,x,y) \qquad (1)$$

Let the original visual pattern be obtained by an inverse transform
$t^{-1}(u,v,x,y)$ such that

$$p(x,y) = \sum_{U} \sum_{V} P(u,v)\, t^{-1}(u,v,x,y) \qquad (2)$$

If one considers this inverse transformation to occur between area 18
and area 19, then the original pattern is now considered to be a
reconstructed image at the higher cortical level. The advantages of
transforming and reconstructing the pattern are realized when the transformed
pattern is spatially filtered, and this will be discussed in later
sections. Whereas the reconstructed image is an isometry of the spatially
filtered transform and adds no additional information for a pattern
recognition algorithm, it does illustrate more meaningfully what the
spatial filtering accomplishes with regard to the original pattern
features.

Formalizing the preceding concepts in terms of matrix notation, let

[p] = pattern matrix

[T] = transform matrix

[P] = transformed pattern matrix

where each matrix is an NxN square array.

Then by matrix multiplication

$$[P] = [T]\,[p]\,[T]^{T} \qquad (3)$$

and if [T] has an inverse, then the inverse transform is

$$[p] = [T]^{-1}\,[P]\,\left([T]^{T}\right)^{-1} \qquad (4)$$

If [T] is based upon complex exponentials such that

$$[T] = \left[\exp\left(-j\,\frac{2\pi u x}{N}\right)\right] \qquad (5)$$

then

$$[T]^{-1} = \frac{1}{N}\,[T]^{*} \qquad (6)$$

where [T]* is the complex conjugate of [T] and [P] is the finite,
discrete two-dimensional Fourier Transform of [p] (Ref 29).
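
As an illustration of Eqs (3) through (6), the following minimal Python
sketch builds [T] from complex exponentials and checks that the transform
can be undone. The function names and the 4x4 test pattern are assumptions
made for this example; they are not taken from the Tallman-Radoy program.

```python
import cmath


def dft_matrix(n):
    """N x N matrix [T] with entries exp(-j*2*pi*u*x/N), as in Eq (5)."""
    return [[cmath.exp(-2j * cmath.pi * u * x / n) for x in range(n)]
            for u in range(n)]


def matmul(a, b):
    """Plain matrix product of two matrices stored as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def transpose(a):
    return [list(row) for row in zip(*a)]


def forward(p):
    """[P] = [T][p][T]^T, Eq (3)."""
    t = dft_matrix(len(p))
    return matmul(matmul(t, p), transpose(t))


def inverse(big_p):
    """[p] = [T]^-1 [P] ([T]^T)^-1, Eq (4), using [T]^-1 = (1/N)[T]*, Eq (6)."""
    n = len(big_p)
    t_inv = [[value.conjugate() / n for value in row] for row in dft_matrix(n)]
    return matmul(matmul(t_inv, big_p), transpose(t_inv))


# Hypothetical 4 x 4 intensity pattern used only for this illustration.
pattern = [[0, 1, 1, 0],
           [0, 1, 0, 0],
           [0, 1, 1, 0],
           [0, 0, 0, 0]]

spectrum = forward(pattern)
recovered = inverse(spectrum)
```

In this sketch the dc term `spectrum[0][0]` equals the sum of all pattern
intensities, and `recovered` agrees with `pattern` to within rounding error.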

The two-dimensional finite, discrete Fourier Transform pair is
usually expressed as

$$P(u,v) = \sum_{x=1}^{X} \sum_{y=1}^{Y} p(x,y)\, \exp\left[-j 2\pi \left(\frac{xu}{X} + \frac{yv}{Y}\right)\right] \qquad (7)$$

$$p(x,y) = \sum_{u=-U}^{U} \sum_{v=-V}^{V} P(u,v)\, \exp\left[j 2\pi \left(\frac{xu}{X} + \frac{yv}{Y}\right)\right] \qquad (8)$$

where X = 2U + 1, Y = 2V + 1 and 2πxu/X, 2πyv/Y are the spatial
frequencies (cycles per unit length) in the Fourier Transform plane. The
properties of the Fourier Transform pair are fully discussed in Ref 46.
Let

$$z = 2\pi \left(\frac{xu}{X} + \frac{yv}{Y}\right) \qquad (9)$$

Then, Eq (7) becomes

$$P(u,v) = \sum_{x=1}^{X} \sum_{y=1}^{Y} p(x,y)\, \exp(-jz) \qquad (10)$$

Using Euler's Identity, Eq (10) becomes

$$P(u,v) = \sum_{x=1}^{X} \sum_{y=1}^{Y} p(x,y)\,(\cos z - j \sin z)$$

$$\phantom{P(u,v)} = \sum_{x=1}^{X} \sum_{y=1}^{Y} p(x,y) \cos z \; - \; j \sum_{x=1}^{X} \sum_{y=1}^{Y} p(x,y) \sin z \qquad (11)$$
Thus, the Fourier Transform of an intensity pattern p(x,y) can be
considered to be the summation of discrete points obtained from sine and
cosine grids of varying frequencies and orientations superimposed on the
pattern p(x,y), i.e., P(u,v) is a weighted sum of p(x,y). The
specific weighting of p(x,y) is determined by the transform. Kabrisky
and Carl (Ref 29) have suggested that transforms need only two properties
for useful image processing: 1) unique sequency/frequency
interpretation and 2) dense connectivity whereby energy is localized, i.e.,
relatively few spectral components are large. Thus, the weighting
process is indeed very special. In the case of the Fourier Transform,
the pattern p(x,y) is decomposed into a two-dimensional spectrum
whereby each spectral component corresponds to the amount of energy
contained in the pattern components. Furthermore, the energy from
larger pattern features is localized in the lower spectral components
(low spatial frequencies), whereas the energy from the smaller pattern
features is found mainly in the higher spectral components (high spatial
frequencies). This concept is illustrated in a later section.
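
The localization of energy can be checked numerically. The following Python
sketch evaluates P(u,v) from the cosine and sine sums of Eq (11) and compares
the share of spectral energy in the dc and first harmonics for a large
feature and for a small one. The 9x9 array size, the two test patterns, and
the function names are assumptions chosen for illustration, and the indices
start at 0 rather than 1, which changes only the phase of each component.

```python
import math


def fourier_component(p, u, v):
    """Real and imaginary parts of P(u,v) from the sums of Eq (11)."""
    rows, cols = len(p), len(p[0])
    real = imag = 0.0
    for y in range(rows):
        for x in range(cols):
            z = 2 * math.pi * (x * u / cols + y * v / rows)  # Eq (9)
            real += p[y][x] * math.cos(z)
            imag -= p[y][x] * math.sin(z)
    return real, imag


def low_harmonic_fraction(p, k):
    """Fraction of spectral energy in harmonics with |u| <= k and |v| <= k."""
    big_u, big_v = len(p[0]) // 2, len(p) // 2  # X = 2U + 1, Y = 2V + 1
    low = total = 0.0
    for v in range(-big_v, big_v + 1):
        for u in range(-big_u, big_u + 1):
            real, imag = fourier_component(p, u, v)
            energy = real * real + imag * imag
            total += energy
            if abs(u) <= k and abs(v) <= k:
                low += energy
    return low / total


SIZE = 9
# Hypothetical patterns: a 5 x 5 block (large feature) and a single dot.
large_block = [[1 if 2 <= x <= 6 and 2 <= y <= 6 else 0 for x in range(SIZE)]
               for y in range(SIZE)]
small_dot = [[1 if (x, y) == (4, 4) else 0 for x in range(SIZE)]
             for y in range(SIZE)]

block_fraction = low_harmonic_fraction(large_block, 1)
dot_fraction = low_harmonic_fraction(small_dot, 1)
```

Under these assumptions the block places roughly 85 percent of its spectral
energy in the dc and first harmonics, whereas the single dot spreads its
energy evenly over all 81 components, so only 9/81 of it falls in that band.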

The  Computer  Model 

The  Tallman-Radoy  computer  algorithm  utilized  the  preceding  two- 
dimensional  Fourier  Transform  to  classify  patterns  using  a  linear 
decision  function.  The  algorithm  transforms  patterns  into  the  low-pass 
spatially filtered frequency domain and uses the minimum Euclidean
distance between transformed patterns and stored prototypes to classify
like patterns.

Tallman suggests that one "visualize each pattern class as a many-faceted
cloud in the image function space, whose faces are the linear hyperplanes
corresponding to the Euclidean distance between prototypes of each class"
(Ref 48:41). Since minimizing Euclidean distance maximizes correlation,
the Tallman-Radoy algorithm could also be considered as the
cross-correlation of a spatially filtered pattern transform image with a
prototype image.
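
The relation between minimum distance and maximum correlation can be made
concrete with a short Python sketch. It assumes that each filtered transform
has been flattened into a real-valued, energy-normalized feature vector; the
prototype values, the class labels, and the function names are hypothetical
and serve only to illustrate the decision rule. For unit-energy vectors a
and b, the squared distance equals 2 - 2(a·b), so the nearest prototype is
also the most highly correlated one.

```python
import math


def energy_normalize(vector):
    """Scale a feature vector so that its sum of squares is 1."""
    norm = math.sqrt(sum(v * v for v in vector))
    return [v / norm for v in vector]


def euclidean_distance(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def correlation(a, b):
    """Inner product; for unit-energy vectors this is the correlation."""
    return sum(x * y for x, y in zip(a, b))


def classify(features, prototypes):
    """Return the label of the prototype at minimum Euclidean distance."""
    return min(prototypes,
               key=lambda name: euclidean_distance(features, prototypes[name]))


# Hypothetical stored prototypes and one unknown sample.
prototypes = {
    "E": energy_normalize([4.0, 2.0, 1.0, 0.5]),
    "O": energy_normalize([4.0, -1.0, 2.0, 0.2]),
}
sample = energy_normalize([3.8, 1.7, 1.1, 0.4])

label = classify(sample, prototypes)
best_by_correlation = max(
    prototypes, key=lambda name: correlation(sample, prototypes[name]))
```

Both decision rules select the same prototype here, which is the sense in
which the algorithm may be viewed as a cross-correlation.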

The  computer  implementation  of  the  Tallman-Radoy  algorithm  is 
illustrated  by  the  flow  diagram  of  Fig.  3  on  page  15.  Preprocessing 
consists  of  digitizing  a  pattern  transparency  onto  magnetic  tape  by  a 
PDP-1  computer-controlled  Litton  flying  spot  scanner.  The  digitized 
pattern is then processed with the Tallman-Radoy algorithm on an IBM 7094
digital computer. Briefly, the pattern is read in, centered, energy
normalized, and then scaled using the Gill scaling algorithm. The next
procedures are to compute the pattern transform, spatially filter it, and
dc normalize it to correct for any scaling changes. The spatially filtered
transform is then classified to a stored prototype. A correct pattern
classification is acknowledged by computer print-out, whereas the
prototype is updated if an incorrect classification occurs.
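
Two of the preprocessing steps, centering and energy normalization, can be
sketched in Python as follows. The text does not state how the original
program performs them, so the centroid-based circular shift and the
unit-energy scaling used here are assumptions, as are the function names and
the small test pattern.

```python
def centroid(pattern):
    """Intensity-weighted mean (row, column) of the pattern."""
    total = sum(sum(row) for row in pattern)
    row_mean = sum(y * sum(row) for y, row in enumerate(pattern)) / total
    col_mean = sum(x * value for row in pattern
                   for x, value in enumerate(row)) / total
    return row_mean, col_mean


def center(pattern):
    """Circularly shift the pattern so its centroid sits at the array center."""
    rows, cols = len(pattern), len(pattern[0])
    row_mean, col_mean = centroid(pattern)
    dy = round((rows - 1) / 2 - row_mean)
    dx = round((cols - 1) / 2 - col_mean)
    return [[pattern[(y - dy) % rows][(x - dx) % cols] for x in range(cols)]
            for y in range(rows)]


def energy_normalize(pattern):
    """Scale the pattern so that the sum of squared intensities is 1."""
    scale = sum(v * v for row in pattern for v in row) ** 0.5
    return [[v / scale for v in row] for row in pattern]


# Hypothetical off-center "plus" shape on a 5 x 5 grid.
raw = [[0, 1, 0, 0, 0],
       [1, 1, 1, 0, 0],
       [0, 1, 0, 0, 0],
       [0, 0, 0, 0, 0],
       [0, 0, 0, 0, 0]]

centered = center(raw)
normalized = energy_normalize(centered)
```

After these two steps the shape is centered in the array and carries unit
energy, so patterns of different position and brightness become comparable
before the transform is computed.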

It is interesting to note that the centering process and the
normalization process may be likened to human foveal centering and
pupillary expansion, respectively.

The  Optical  Homolog 

The  optical  processing  model  described  herein  is  a  homolog  to  the 
Kabrisky model. The fiber optics homolog that Radoy has suggested to
mimic the visual pathways can also be achieved by collimated light
(Ref 42:11-13). The following theoretical development parallels that
by  Pratt  and  Andrews  (Ref  39). 

Consider a coherent optical system illustrated by Fig. 4 on page 16.
An object (pattern) in the xy-plane with transmittance p(x,y) is
illuminated by a coherent, collimated light beam normally provided by a
laser. At the input plane the electric field amplitude of the light is
proportional to p(x,y). The first spherical lens produces an image of
the object in the focal plane. The Kirchhoff integral of diffraction
theory describes the light electric field amplitude P(u,v) in the
filter plane and is given by the Fourier Transform equation

$$P(u,v) = \iint_{X_0 Y_0} p(x,y)\, \exp\left[-j 2\pi \left(\frac{xu + yv}{\lambda f}\right)\right] dx\, dy \qquad (12)$$

where λ is the wavelength of the coherent light beam and f is the
focal length of the lens. The spatial frequencies in the Fourier
Transform plane are given by 2πxu/λf, 2πyv/λf and are measured in terms
of cycles per unit length. An optically produced Fourier Transform as
well as a three-dimensional amplitude graph of a Fourier transformed
square is illustrated by Fig. 5 on page 18. A second spherical lens,
inserted as in Fig. 4, will produce a second Fourier Transform that
will reconstruct the image at the output plane.
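
The statement that a second Fourier Transform reconstructs the image can be
checked numerically. The following Python sketch uses a one-dimensional
discrete analogue, assumed here for brevity, and applies the forward
transform twice. The result is the original sequence scaled by N and
reversed about the origin, which corresponds to the inverted image formed at
the output plane of a two-lens system. The sequence values and names are
hypothetical.

```python
import cmath


def dft(sequence):
    """Forward discrete Fourier transform with kernel exp(-j*2*pi*u*x/N)."""
    n = len(sequence)
    return [sum(sequence[x] * cmath.exp(-2j * cmath.pi * u * x / n)
                for x in range(n)) for u in range(n)]


signal = [1, 2, 3, 4, 0, 0]
n = len(signal)

# Two forward transforms in succession, as with two lenses in series.
twice = dft(dft(signal))
reconstructed = [(value / n).real for value in twice]

# The original samples read in reverse order about index 0.
expected = [signal[(-m) % n] for m in range(n)]
```

Here `reconstructed` matches `expected`, that is [1, 0, 0, 4, 3, 2], to
within rounding error.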

Modification  of  the  reconstructed  image  can  be  obtained  by 
inserting  a  filter  with  a  transmittance  function  H(u,v)  at  the  filter 
plane.  The  filter  will  modify  the  amplitude  and  phase  of  the  light, 
thus  modifying  the  spatial  frequencies  of  the  transformed  pattern.  As 
Pratt  and  Andrews  point  out,  a  major  problem  with  conventional  optical 
processes  is  the  construction  of  filters  with  different  transmittances 
(Ref 40:13). However, with computer simulation, amplitude and phase
functions of a filter are easily represented. Therefore, whereas the
optical processing allows immediate viewing of a reconstructed image,
precise control of the filtering process can be most easily attained by
a computer simulation.

The optical system used in this paper will be discussed in a later
section.

Spatial Filtering

The spatial filtering techniques that have been alluded to in the
previous sections are now discussed. It is the low-pass spatial
filtering of the transformed pattern that has been mainly responsible for
the success of the Tallman-Radoy algorithm in classifying patterns. The
power of spatial filtering in the field of optics has been demonstrated
by the Abbe-Porter experiments (Ref 18:143-146), O'Neill (Ref 37),
Pratt and Andrews (Ref 40), and most recently by Lendaris and Stanley
(Ref 31).

In  terms  of  pattern  classification,  filtering  may  be  generally 
defined  as  a  process  by  which  features  relevant  to  the  pattern  class  are 
separated from irrelevant features. For example, serifs on any letter
are an irrelevant feature with regard to letter identification.
Low-pass spatial filtering of a Fourier transformed pattern does precisely
this type of feature separation.

Spatial filtering is accomplished most easily in the frequency
domain by describing the filter as a two-valued transfer function:

$$H(u,v) = \begin{cases} 1 & \text{at some points } (u,v) \\ 0 & \text{elsewhere} \end{cases}$$

Therefore, a reconstructed image of a spatially filtered transform is
given by:

$$p_r(x,y) = \mathcal{F}^{-1}\{P(u,v)\, H(u,v)\} \qquad (13)$$

which is a simple multiplicative operation in the frequency domain.
Low-pass spatial filtering of the Fourier transform is accomplished by
letting H(u,v) = 1 at the center portions of the transform and
H(u,v) = 0 for the rest of the transform.
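
A minimal Python sketch of Eq (13) with a low-pass H(u,v) follows. It
transforms a small pattern, retains only the central harmonics, and
reconstructs the image. The 7x7 letter pattern, the function names, and the
explicit 1/(XY) factor in the reconstruction are assumptions of this
example; the factor is included so that an unfiltered round trip returns the
original pattern exactly.

```python
import cmath


def transform(p, big_u, big_v):
    """P(u,v) for u in -U..U and v in -V..V, following the form of Eq (7)."""
    rows, cols = len(p), len(p[0])
    return {(u, v): sum(p[y][x] * cmath.exp(
                -2j * cmath.pi * (x * u / cols + y * v / rows))
                for y in range(rows) for x in range(cols))
            for u in range(-big_u, big_u + 1)
            for v in range(-big_v, big_v + 1)}


def low_pass(u, v, k):
    """Two-valued H(u,v): 1 in the central (2k+1) x (2k+1) block, else 0."""
    return 1 if abs(u) <= k and abs(v) <= k else 0


def reconstruct(big_p, rows, cols, k):
    """p_r(x,y) of Eq (13), with an assumed 1/(rows*cols) normalization."""
    return [[sum(value * low_pass(u, v, k) * cmath.exp(
                 2j * cmath.pi * (x * u / cols + y * v / rows))
                 for (u, v), value in big_p.items()).real / (rows * cols)
             for x in range(cols)] for y in range(rows)]


# Hypothetical 7 x 7 letter E, so that X = Y = 2*3 + 1.
letter_e = [[0, 0, 0, 0, 0, 0, 0],
            [0, 1, 1, 1, 1, 1, 0],
            [0, 1, 0, 0, 0, 0, 0],
            [0, 1, 1, 1, 1, 0, 0],
            [0, 1, 0, 0, 0, 0, 0],
            [0, 1, 1, 1, 1, 1, 0],
            [0, 0, 0, 0, 0, 0, 0]]

spectrum = transform(letter_e, 3, 3)
unfiltered = reconstruct(spectrum, 7, 7, 3)  # all harmonics passed
blurred = reconstruct(spectrum, 7, 7, 1)     # dc and first harmonic only
```

With all harmonics passed, the reconstruction equals the original pattern.
With only the dc and first harmonic, the total intensity is preserved but
the fine detail is lost, giving a blurred version of the letter.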

In terms of the biological model, assume a Fourier-like transform
at cortical area 18. Spatial filtering can be conceptualized to occur
by "pre-wired" cortical connections from area 18 to area 19 that
transmit just parts of the transform. Additionally, a selective attention
mechanism at a higher cortical level might learn to "pay attention to"
different parts of the transform at area 18. Validity for concepts
such as these awaits further experimentation.

The Tallman-Radoy algorithm accomplishes low-pass spatial
filtering by computing only the lower spectral components of the transform.
The computer model filter sizes are designated as 3×3, 5×5, 7×7, and 9×9,
which pass the dc and first, the dc through second, the dc through
third, and the dc through fourth spatial harmonics, respectively, as
illustrated by Fig. 6, page 21.
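
The correspondence between filter size and the harmonics passed follows a
simple rule: an n×n filter passes the dc through the (n - 1)/2 harmonic in
each direction. The short Python sketch below states that rule; the function
names are hypothetical.

```python
def highest_harmonic(filter_size):
    """Highest spatial harmonic passed by an n x n low-pass filter."""
    if filter_size < 1 or filter_size % 2 == 0:
        raise ValueError("filter size must be a positive odd integer")
    return (filter_size - 1) // 2


def retained_harmonics(filter_size):
    """All (u, v) index pairs computed for an n x n filter, dc at (0, 0)."""
    k = highest_harmonic(filter_size)
    return [(u, v) for v in range(-k, k + 1) for u in range(-k, k + 1)]


# Filter sizes named in the text mapped to the highest harmonic passed.
harmonic_table = {size: highest_harmonic(size) for size in (3, 5, 7, 9)}
```

For the four sizes in the text this gives the first, second, third, and
fourth harmonics, and a 3×3 filter retains nine spectral components in all.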

Low-pass spatial filtering of an optically produced Fourier
Transform may be achieved by putting a mask filter in the focal plane where
the transform occurs. The mask filter is simply a transparent portion
of an opaque material which blocks the higher spatial frequencies from
contributing to the reconstructed image. The mask filters are measured
in microns (μ), and for the half-inch patterns presented in this paper,
100-μ, 200-μ, and 300-μ mask filters passed the dc and first, the dc
through third, and the dc through fifth spatial harmonics, respectively.
The effects of spatial filtering on the letter E are illustrated using
optical techniques by Fig. 7, page 21. Consider the tremendous amount of
redundant information that this pattern has in terms of spatial harmonics.
The original letter has theoretically infinite spatial harmonics;
nevertheless, only the dc and first harmonic are required for human as well as
machine identification. Later experiments will demonstrate even more
convincingly the power of low-pass spatial filtering in regard to the

This concludes the introduction to the Kabrisky model and
techniques used in the forthcoming experiments. However, subsequent
experimental results will necessitate a return to spatial filtering theory.

III. Identification of Defocused Letters Experiment

Introduction 

Tallman demonstrated that low-pass spatially filtered alphanumeric and geometric patterns could be correctly classified with a high degree of accuracy. This led to the hypothesis that human cortical retention of low-spatial frequency (blurred) patterns would also be adequate for identification (Ref 28:137). It is difficult to accurately correlate this hypothesis with previous psychological experimentation concerning defocused stimuli, due to the diverse defocusing schemes and different stimuli that have been used. However, this hypothesis does seem in opposition to statements such as Fredericksen's: "... when the stimulus is obscured and made ambiguous by being thrown out of focus, visual recognition is impaired" (Ref 14:1). That position implies the necessity of the high spatial frequencies that the model implies are redundant for pattern recognition.

This  hypothesis  was  investigated  by  a  learning  task  which  required 
both long and short term memory and was based upon the span of perception method. Two subject groups were required to observe and report
letter  arrays--one  group  with  normal  letters  and  the  other  with  defocused 
letters--under  controlled  conditions. 

The span of perception method is perhaps one of the oldest psychological experimental methods. It requires the subject to report a briefly presented stimulus containing multiple stimulus objects (Ref 45:902). Sperling, in investigating short term memory, demonstrated that the subjects' report of capital English letter arrays exposed for 50 msec is invariant to letter arrays of over five letters and to the spatial arrangement of the letters (Ref 4:204). He summarized the experiment by stating: "When the subjects are asked to report all the letters of a [...]" (Ref 4:204). These results formed the basis of the learning task. The rationale behind the experimental paradigm is that the high spatial frequencies missing from the defocused letters are redundant and should not detract from the letter identification and subsequent learning. Thus, implicit in the letter array learning is an investigation of the invariance of stimulus identification by means of different spatial frequency content. Consistent with psychological research methods, the null hypothesis is that there is no relationship between the mean learning performance with letter stimuli of high and low spatial frequency content.

Experimental  Method 

The subject population consisted of 10 males--nine graduate engineering students and one faculty professor, each having reported corrected or uncorrected 20/20 defect-free vision. Initially, five subjects were exposed to normal stimuli and five were exposed to defocused stimuli. The stimulus set consisted of 26 six-letter arrays (3x2) of capital, block, consonant English letters. The normal stimulus was 0.5-in. black letters on a white card which filled a 1.38 in. wide, 1.50 in. high array as Fig. 8a illustrates. The defocused stimulus was obtained by photographing optically defocused letters to approximately the fourth spatial harmonic. The fourth spatial harmonic was determined by defocusing a 0.5 in. letter V until a 0.125 in. cross section of the V could not be resolved. (The availability of the optics equipment that was used to obtain the reconstructed images reported in this paper was not known at this time.) Thus, the defocused stimulus was approximately 0.5 in. block letters on a transparent film which filled a 1.19 in. wide, 1.31 in. high array. The defocused stimulus was presented on a white card background as Fig. 8b illustrates.


Fig. 8

Example of Normal and Defocused Letter Arrays Used for
the Identification of Defocused Letters Experiment

The stimulus arrays were exposed in the Scientific Prototype (Model GB) three-channel tachistoscope shown in Fig. 9. The tachistoscope allowed precise control over stimulus exposure time (± Q%) and illumination. The luminance of the three tachistoscope channels was set at a mid-scale range of approximately 9 mL throughout the experiment. Each subject received verbal instructions to give a forced whole response to the stimulus arrays by writing the letters he thought he saw during exposure in pre-printed 3x2 block arrays. The subject was instructed to focus on a black cross and then self-initiate the stimulus presentation by pressing a button. This initiated the stimulus presentation in the following paradigm:

1. white field pre-exposure with focusing cross for 0.5 sec.

2. stimulus array exposure for 50 msec.

3. white field post-exposure

The  stimulus  array  was  then  changed  as  the  subject  wrote  his  response. 


Fig.  9 

Three-Channel Tachistoscope


Each subject was exposed to the same stimulus array sequence twice for three trials (Case 1). Three subjects from each group were exposed to the same stimulus arrays twice for five trials (Case 2). The stimulus arrays for each group were then switched--defocused for normal and normal for defocused--for four more trials (Case 3) resulting in a total of 20 trials.

Results 

A  correct  subject  response  was  considered  to  be  a  letter  written  in 
the  correct  array  position  with  respect  to  the  stimulus  array.  The  mean 
number of correct letters from all the subjects in each stimulus group


Fig.  10 

Chart of Mean Number of Normal and Defocused Letters
Identified  Correct  per  Trial 


less than one standard deviation of each group mean per trial, and the mean standard deviations for the 20 trials for the normal and defocused groups are 0.5522 and 0.4004, respectively. A t-test for the mean of
two  independent  samples  was  computed  to  evaluate  the  mean  performance 
of  the  two  groups  (Ref  43:165-169).  The  results  are  shown  in  Table  I. 
Since  the  t  statistics  are  much  less  than  the  significant  values  at  the 
0.2  level,  it  is  concluded  that  there  is  no  significant  difference 
between  the  identification  of  the  normal  and  defocused  letters  in  this 
experiment  and  the  null  hypothesis  is  rejected.  These  results  support 
Table I

t-Test Results for Mean Letters Identified Correctly
for Normal and Defocused Letter Groups

                               Trials 1-6             Trials 1-20
Stimuli                     Normal   Defocused     Normal   Defocused

Sample Means                4.2607    4.3230       4.8773    4.8427

Sample Estimates of the
Standard Deviations         0.4486    0.2348       0.5522    0.4004

Degrees of Freedom                10                     38

t Statistic                     0.3015                 0.2269
                          (p > 0.01 = 3.17)      (p > 0.01 = 2.71)


the validity of the initial hypothesis concerning the adequacy of cortical retention of blurred (low spatial frequency) patterns for pattern identification.
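The t statistics in Table I can be checked directly from the tabulated means and standard deviations. The sketch below assumes a pooled-variance t-test for two independent samples and infers the group sizes (6 and 20 trial means per group) from the tabulated degrees of freedom; the function name `pooled_t` is hypothetical, and the exact procedure of Ref 43 may differ in detail.

```python
import math

def pooled_t(mean_a, sd_a, n_a, mean_b, sd_b, n_b):
    """Two-sample t statistic with a pooled variance estimate."""
    df = n_a + n_b - 2
    pooled_var = ((n_a - 1) * sd_a ** 2 + (n_b - 1) * sd_b ** 2) / df
    std_err = math.sqrt(pooled_var * (1.0 / n_a + 1.0 / n_b))
    return abs(mean_a - mean_b) / std_err, df

# Values from Table I. The sample sizes are an assumption inferred
# from the tabulated degrees of freedom (10 and 38).
t_early, df_early = pooled_t(4.2607, 0.4486, 6, 4.3230, 0.2348, 6)
t_all, df_all = pooled_t(4.8773, 0.5522, 20, 4.8427, 0.4004, 20)
print(round(t_early, 3), df_early)
print(round(t_all, 3), df_all)
```

Under these assumptions the computed statistics agree with the tabulated values of 0.3015 and 0.2269 to within rounding of the inputs.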

Discussion 

These results should, perhaps, be qualified. To control the spatial frequency content of the defocused stimuli was difficult because of the photographic process used to generate them. However, comparison of the defocused letters with reconstructed image letters containing only the dc and first spatial harmonic obtained by optical Fourier transform techniques shows no significant differences, as the letters K in Fig. 11 illustrate. This appears to be a good time to emphasize the difference between defocused stimuli and spatially filtered stimuli. It should be noted that defocusing smears the higher spatial frequencies throughout the image plane, whereas low-pass spatial filtering of a transform


Fig. 11

Comparison of Defocused and Low-Pass
Spatially Filtered Letter K
(Panels: Normal; Low-Pass Spatially Filtered to dc and First Harmonic;
Optically Defocused to Fourth Spatial Harmonic)

blocks the higher spatial frequencies from appearing in the reconstructed image. Other advantages of the spatial filtering technique over defocusing will become apparent in more appropriate sections. Additional qualifications concerning this experiment include a single non-representative subject population and only one class of stimuli.

The experimental paradigm was devised to be consistent with the biological assumptions of the model. The tachistoscope viewing distance of the letter arrays was 42 in. Therefore, the visual angles of the normal and defocused letter arrays were 1° 53' and 1° 47', respectively, which are well under the 3° 20' parafoveal region reported by Polyak (Ref 38). This is consistent with the foveal assumptions of the model. The stimulus exposure of 50 msec is one-fourth of the minimum 200 msec required for voluntary eye scans (Ref 30:17). Therefore, the subjects were not able to "read" the letters individually but were forced to input the letter arrays as a "whole." This point is important because it demonstrates that a feature selection process for human pattern identification can operate at a human cortical level that does not necessarily require the dissection of pattern features by eye scans for subsequent identification. Once again, this concept is compatible with the temporal assumptions of the model.
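The visual angles quoted above follow from the usual relation, angle = 2 arctan(extent / (2 x distance)). The sketch below is a check under an assumption: the text does not say which array dimension was used, and the quoted values are reproduced when the 1.38 in. width of the normal array and the 1.31 in. height of the defocused array are taken as the extents. The function names are hypothetical.

```python
import math

def visual_angle_deg(extent_in, distance_in):
    """Angle subtended by an object of the given extent, in degrees."""
    return math.degrees(2.0 * math.atan(extent_in / (2.0 * distance_in)))

def as_deg_min(angle_deg):
    """Convert decimal degrees to (degrees, minutes), rounded to 1 minute."""
    total_minutes = round(angle_deg * 60)
    return divmod(total_minutes, 60)

VIEWING_DISTANCE_IN = 42.0
# Assumed extents: normal array width, defocused array height.
normal = as_deg_min(visual_angle_deg(1.38, VIEWING_DISTANCE_IN))
defocused = as_deg_min(visual_angle_deg(1.31, VIEWING_DISTANCE_IN))
print(normal, defocused)
```

Both angles are well inside the 3° 20' region cited from Polyak, whichever array dimension is used.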

The results of this experiment tend to support the Gestalt concept that contour provides basic visual information (Ref 30:13). Another look at the stimulus arrays reveals that the characteristics of the normal and defocused letters obviate letter recognition based upon pattern template matching or edge detection. If contour were not the major factor in identifying the letters in the arrays, then one would expect that the detail and edge contrast difference between the two stimuli would have resulted in different learning curves. However, this was not the result. Therefore, it is suggested that the high spatial frequencies that do affect detail and edge contrast are redundant information for letter identification, whereas the low spatial frequencies that provide contour contain the necessary information for letter identification.

It  is  concluded  that  these  results  tend  to  validate  the  model  and 
further  investigation  of  pattern  identification  via  spatial  frequency 
content is warranted. Furthermore, it is suggested that stimulus content in terms of spatial frequencies would provide a more meaningful
metric  that  appears  lacking  in  previously  reported  psychological  studies 
of  ambiguous  stimuli. 
IV. Letter Identification under Rotation Experiment

Introduction 

Attneave has pointed out that an adequate theory for human form perception should not only take into account the successful perceptual tasks humans perform, but it should also account for tasks under which humans fail (Ref 2:66). Specifically then, a valid model of the human visual system should provide a metric for invariance of form perception. One method for investigating this metric is pattern rotation. Human variance in recognizing rotated forms is well known and has been widely studied (Ref 11). Since the model also fails in recognizing rotated patterns, Kabrisky hypothesized that the invariance of pattern identification could be quantitatively described by the cross-correlation of the rotated low-pass spatially filtered pattern transforms with a reference transform (personal communication). The invariance of rotation could be measured by the minimum variance of the Euclidean distance between a reference and rotated pattern transform. It will be illustrated later that the circular letter O has low variance, whereas the angular letter Z has high variance.
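The distance measure described above can be sketched as follows. This is an illustration only, not the Tallman-Radoy program: the 16 x 16 patterns are hypothetical, the features are assumed to be the complex low-frequency Fourier coefficients normalized to unit energy, and only an exact 90° rotation is used so that no interpolation is needed. A square ring stands in for a rotation-symmetric shape and a crude Z for an angular one.

```python
import cmath
import math

def low_pass_features(img, k):
    """Energy-normalized DFT coefficients for harmonics -k..k on each axis."""
    n = len(img)
    feats = []
    for v in range(-k, k + 1):
        for u in range(-k, k + 1):
            c = sum(img[y][x] * cmath.exp(-2j * cmath.pi * (u * x + v * y) / n)
                    for y in range(n) for x in range(n))
            feats.append(c)
    energy = math.sqrt(sum(abs(c) ** 2 for c in feats))
    return [c / energy for c in feats]

def euclidean_distance(a, b):
    """Euclidean distance between two complex feature vectors."""
    return math.sqrt(sum(abs(p - q) ** 2 for p, q in zip(a, b)))

def rotate90(img):
    """Rotate a square image by a quarter turn about the grid center."""
    n = len(img)
    return [[img[x][n - 1 - y] for x in range(n)] for y in range(n)]

N = 16
# Hypothetical test patterns (1 = ink, 0 = background).
ring = [[1 if 3 <= max(abs(x - 7.5), abs(y - 7.5)) <= 6 else 0
         for x in range(N)] for y in range(N)]
z = [[0] * N for _ in range(N)]
for x in range(3, 13):
    z[2][x] = z[3][x] = z[12][x] = z[13][x] = 1   # top and bottom bars
for y in range(4, 12):
    z[y][15 - y] = z[y][14 - y] = 1               # diagonal stroke

K = 1  # 3x3 filter: dc and first harmonic
d_ring = euclidean_distance(low_pass_features(ring, K),
                            low_pass_features(rotate90(ring), K))
d_z = euclidean_distance(low_pass_features(z, K),
                         low_pass_features(rotate90(z), K))
print(d_ring, d_z)
```

The ring is unchanged by a quarter turn, so its distance is zero, while the Z yields a clearly nonzero distance; this is the qualitative contrast between rounded and angular letters that the hypothesis relies on.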

The human visual system is spatially polarized with recognition thresholds for lines being dependent upon their vertical, horizontal, or diagonal orientation (Ref 30:29). Hake has summarized two recognition under orientation experiments--Fitts, et al., with a study of the Ohio State metric figures and Henle with a study of partially complete nonsense figures and alphabet letters (Ref 20:71-72). The Fitts study reported recognition performance of vertically oriented bilaterally symmetrical figures being superior to horizontally oriented figures. Henle's results strikingly demonstrate that the recognition of letters is greatly dependent upon orientation. Thus, the invariance of rotation was investigated by correlating human identification errors of a rotated letter with the Euclidean distance between the low-pass spatially filtered transforms of the reference letter and the rotated letter obtained by means of the computer model. The rationale is that letters with high Euclidean variance under rotation should correlate with high human identification errors. The null hypothesis is that there is no correlation of letter rotation invariance between these two modalities.

Experimental  Method 

The  subject  population  consisted  of  five  naive  right-handed  male 
graduate  engineering  students  with  uncorrected  20/20  defect-free  vision. 
The  stimulus  set  was  a  26-letter  English  alphabet  of  black,  unfiltered, 
0.5  in.  high,  0.375  in.  wide  capital  block  letters,  each  on  a  white 
card. Photographic negatives of the letters were used as the computer
stimulus.  Each  of  the  five  subjects  was  evaluated  for  three  trials, 
each  trial  having  a  different  random  arrangement  of  the  26  letters, 
providing  a  total  of  15  trials. 

The letters were exposed individually in a Scientific Prototype (Model GB) three-channel tachistoscope at angles of 0, 30, 60, 90, 120, and 150 degrees. The illumination of the tachistoscope channels was set at 9 mL and remained constant throughout the experiment.

Each subject received verbal instructions that he would be shown any capital block English letter at any angle for a brief period of time and that he was to report verbally the letter that he thought was being presented. Upon request, the subject focused within a black circle and self-initiated the letter exposure by pressing a button.
An identification threshold for each subject at each trial was obtained by the method of limits. The presentation time was increased in 0.1 msec increments from 1.5 msec until the subject could correctly identify a letter presentation (G, R, B, for trials 1, 2, 3, respectively). The identification threshold time remained constant during each 26-letter trial. The average identification threshold time for the 15 trials was 3.6 msec.

A subject report was required for each letter at each of the six angles. The letters were rotated counter-clockwise (with respect to the subject) from 150° to 0° in 30° increments using a rotation rig mounted on the rear of the tachistoscope. A white card containing random block lines was exposed to the subject after the 0° letter presentation and a subject report was also required. The subjects usually guessed a letter for that presentation. This allowed a letter change to be made between exposures without disrupting the presentation rhythm. When each subject was queried as to the experimental paradigm after each trial, none reported a sequential rotation of exposed letters. This technique was used in lieu of a counter-balancing technique to obtain maximum data from a minimum number of available subjects. In addition, a counter-balancing technique was not used because the learning during the three trials was considered to be minimal and may have distorted the sensitive identification error differences between adjacent angular positions.

The Euclidean distance for each rotated letter, referenced to itself at 0°, was obtained by using a modified Tallman-Radoy computer program using a Cooley-Tukey Fast Fourier Transform (Ref 9). Since the shape and size of the spatial filter would have a great effect upon the Euclidean distance values, four low-pass circle and square spatial filters were used: 3x3, 5x5, 7x7, and 9x9. These are the same filter sizes that have been used for previous pattern classification schemes and by Maher's [...]

Results 

The total subject identification errors are presented in Table II on the following page. The values of Euclidean distance for each letter undergoing rotation for each filter shape and filter size are given in Appendix C. A one-way analysis of variance was computed to determine the homogeneity of the subject identification errors with respect to treatment (letter orientation) as well as subject population (Ref 43:230-237). The results are listed in Table III on page 36. The significant results of the treatment indicate that the subject identification errors were due to the treatment, i.e., there is non-subject variability. The non-significant results of the subject population indicate that the identification errors came from a homogeneous population. These results validate the experimental paradigm and allow further analysis.
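The F ratios reported in Table III follow from the tabulated sums of squares and degrees of freedom, since each mean square is a sum of squares divided by its degrees of freedom and F is the ratio of the between-groups to the within-groups mean square. The short check below uses only the Table III values; the function name `f_ratio` is hypothetical.

```python
def f_ratio(ss_between, df_between, ss_within, df_within):
    """Return (MS between, MS within, F) for a one-way analysis of variance."""
    ms_between = ss_between / df_between
    ms_within = ss_within / df_within
    return ms_between, ms_within, ms_between / ms_within

# Sums of squares and degrees of freedom as listed in Table III.
treatment = f_ratio(1669.77, 5, 1351.60, 24)
subjects = f_ratio(924.87, 4, 2096.50, 25)
print([round(v, 2) for v in treatment])
print([round(v, 2) for v in subjects])
```

Both partitions sum to the same total of 3021.37, as they should for two analyses of the same 30 observations.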

The Pearson product-moment coefficient of correlation (r) was computed to correlate the total subject identification errors with the Euclidean distance of each letter at angles of 0, 30, 60, and 90 degrees. Angles of 120 and 150 degrees were not used because the maximum variance of Euclidean distance occurs in a 90° arc due to symmetry properties of the Fourier transform. (This property is illustrated by the fall-off of the Euclidean distance at 120° and 150° in the Appendix C tables.) The coefficient of determination (d), where d = r^2, was also computed for a more valid interpretation of the correlation coefficient (Ref 43:79).
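The per-letter computation of r and d can be sketched as below. The error counts are those of the letter Z in Table II; the Euclidean distance values are HYPOTHETICAL placeholders, since the Appendix C values are not reproduced in this section, so the resulting numbers illustrate the procedure only and are not results of the experiment.

```python
import math

def pearson_r(xs, ys):
    """Pearson product-moment coefficient of correlation."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

# Identification errors for the letter Z at 0, 30, 60, 90 degrees (Table II).
errors_z = [4, 6, 7, 7]
# HYPOTHETICAL Euclidean distances at the same angles (not from Appendix C).
distances_z = [0.0, 5.0, 9.0, 10.0]

r = pearson_r(distances_z, errors_z)
d = r ** 2   # coefficient of determination
print(round(r, 4), round(d, 4))
```

With only four points per letter, r is very sensitive to any single value, which is the limitation the discussion of linearity later acknowledges.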

The  level  of  significance  of  r  and  d  was  set  as  0.20,  a  nominal  level 
for  exploratory  research  (Ref  43:155).  The  values  of  r  and  d  for  each 
Table II

Total Subject Identification Errors for Each Letter at
Angles of 0, 30, 60, 90, 120, and 150 Degrees from the
Identification of Rotated Letters Experiment

              Total Identification Errors at
              Each Angle for Each Letter

Letter           0°    30°    60°    90°   120°   150°

A                 9     13     12     10     12     13
B                 9     10      9      7     10     13
C                10     13     14     13     13     14
D                 3      7      6      9      8      8
E                 7      9      8      9     10     12
F                 7     15     15     14     13     14
G                 9     10     10     10     12     12
H                 9      9      8      8     11     15
I                 5      9      9      9     10     12
J                11     10     11     11     11     13
K                11      9     10     11     13     14
L                 5      7      8      8     10     10
M                 5      6      6      6      7     12
N                 6      8      5      5      5     10
O                 6      4      4      5      4      4
P                 5      7     10      9     11     13
Q                12     14     15     14     14     14
R                 8      8     11     12     11     13
S                 6      7      6      6     11     11
T                 7      9     10     11     11     11
U                 2      3      4      4      5      7
V                 7      7      5      4      4     11
W                 3      5      5      3      7     10
X                 8      6      7      7     10     12
Y                 4      6      8      7      9      9
Z                 4      6      7      7     12     12

Total Errors    178    217    223    219    254    299
Table III

Analysis of Variance with Respect to Treatment and
Subject Population of Identification of Rotated Letters Experiment

Analysis of Variance with Respect to Treatment

                  Sum of Squares    DF    Mean Square    F Ratio

Between Groups        1669.77        5       333.95       5.93*
Within Groups         1351.60       24        56.32
TOTAL                 3021.37       29

* p > 0.05 = 2.62
  p > 0.01 = 3.90

Analysis of Variance with Respect to Subject Population

                  Sum of Squares    DF    Mean Square    F Ratio

Between Groups         924.87        4       231.22       2.76*
Within Groups         2096.50       25        83.86
TOTAL                 3021.37       29

* p > 0.05 = 2.76
  p > 0.01 = 4.13

filter size and shape of each letter are shown in Table IV on the following page. The 3x3 square and the 5x5 square and circle filters have 14 letters above 0.80 when the correlation coefficient is used, whereas the 7x7 square and the 9x9 square and circle filters have 13 letters above 0.80 when the coefficient of determination is used. While the null hypothesis cannot be rejected for each letter, the large number of letters with correlations above 0.80 is significant and supports the validity of the model as providing a metric for human invariance of rotation.

Discussion. An underlying assumption for the use of the Pearson product-moment coefficient of correlation is that the relationship between the two variables be linear and, as a rule, linearity can be determined by inspecting a scatter diagram (Ref 19:98-103). But this is not meaningful with a sample size as small as the four points that were available in correlating each letter. Thus, the coefficient of correlation results must be tempered by the assumption of linearity between the two variables. However, the fact that the coefficients of determination were quantitatively similar to the coefficients of correlation and represent the proportion of the variance of a variable which may be predicted by another variable tends to substantiate the results. Other qualifications that should be included are a single non-representative subject population and only one stimulus class.

The seven letters that consistently have negative coefficients of correlation are found to have the highest identification errors at 0° and 30°. It is believed that these errors are mainly due to the experimental paradigm and could be eliminated with a counter-balancing technique. For example, the letter X was frequently mistaken for the letter Z as was the letter V for the letter A. Additionally, Conrad has demonstrated that acoustic confusions exist with visually presented letters, i.e., identification errors of letters may be a function of acoustic similarity as well as structural similarity (Ref 10). This is another factor which must be considered in human letter identification errors. However, a comparison of Conrad's table of listening errors with the confusion matrices of the identification errors shown in Appendix B reveals little similarity, and the identification errors reported in this paper will be mainly assumed to be due to structural similarity.

Analysis of the filter shape with respect to the coefficient of determination reveals that the square filter exceeded the circle filter in the number of letters with values above 0.80 for the 3x3, 5x5, and 7x7 filters. The results for the 9x9 square and circle filters were equal. This is to be expected; as the filter size is increased, the higher spatial frequencies added by the square corners become less significant. The ramifications of the square filter as optimal with respect to the circle filter will be discussed in a later section.

One final point is that the Euclidean distance variations of rotated patterns can be quantified in terms of statistical variance whereby

$$ s^2 = \frac{\sum x^2}{N} \tag{14} $$

where x = deviation from the sample mean and N = sample size. Hence,
the variances over a 90° arc, computed for the Euclidean distance at 30°, 60°, and 90° for the letters O, I, and Z, are 3.35, 49.45, and 109.36, respectively. This would appear to be a meaningful metric to be used in quantifying pattern shape variations under rotation. The disparity of variance between the letters O and Z is an intuitively satisfying result since the letter O is circular (actually, it was slightly elliptical), whereas the letter Z is highly angular.

It  is  concluded  that  the  correlation  results  support  the  validity 
of the model in providing a metric of human invariance of form perception
under  rotation  and  further  investigation  is  warranted. 
V. Correlates to the Gestalt Principles of
Perceptual Organization and Geometric Illusion

Gestalt Principles of Perceptual Organization

Introduction. The Gestalt psychologists have long been concerned with perceptual organization. Wertheimer, who initiated the Gestalt movement in psychology in 1923, argued that perceptual inputs are actively organized, not passively detected (Ref 50). He went on to define the principles of organization; three of them are: proximity (closest elements form groups), similarity (similar elements form groups), and closure (fragments form wholes). Additionally, he noted that the effect of proximity is stronger than the effect of similarity. Another concern of the Gestalt psychologists has been perceptual figure-ground separation, whereby the figures in the visual field are perceived as a whole, separate from the remaining field. An electrical engineering analogy is the separation of signal from noise. An important point of these principles is that they are inadvertent concomitants to pattern identification. As Evans points out, the perceptual organization "... is accomplished without any awareness on our part and we normally take it as given" (Ref 13:4).

The Gestalt principles are criticized mainly because they are intuitive descriptions that cannot identify the perceptual factors that cause them. While computer techniques, such as Zahn's (Ref 52), have been used to describe some of these principles in a rigorous mathematical way, no one--to the author's knowledge--has provided a unified explanation of all these principles with a model such as that reported in this paper. If the model is valid, then it should explain the Gestalt principles in terms of the characteristics of low-pass spatially filtered transforms. Therefore, an initial investigation of the Gestalt principles was undertaken using Fourier optics techniques.

Research Method. A KMS Industries two-lens optical correlator was arranged to perform the Fourier transform. A helium-neon laser provided the coherent light (wavelength = 6.328 x 10^-4 mm). One-half inch dot patterns, made from pin holes in aluminum sheets, were inserted in the object plane; spatial filtering was accomplished by mask filters in the focal plane; and the reconstructed image from the inverse spatially filtered Fourier transform was obtained at the image plane. The 200 μ and 300 μ square and circle mask filters pass the dc and first three and the dc and first five spatial harmonics, respectively, when referenced to a half-inch pattern size.
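The relation between mask size and the number of harmonics passed can be sketched from standard Fourier optics, where a spatial frequency v appears in the focal plane at a distance (wavelength x focal length x v) from the optical axis. The sketch below rests on several assumptions that are not stated in the text: the focal length of the transform lens is a HYPOTHETICAL 550 mm, the quoted mask size is taken as the full width of an opening centered on the dc term, and the half-inch pattern size is taken as the fundamental period.

```python
import math

WAVELENGTH_MM = 6.328e-4     # helium-neon laser, as given in the text
PATTERN_SIZE_MM = 12.7       # one-half inch pattern
FOCAL_LENGTH_MM = 550.0      # HYPOTHETICAL; not given in the text

def harmonic_spacing_um(wavelength_mm, focal_length_mm, pattern_size_mm):
    """Focal-plane distance between successive spatial harmonics, in microns."""
    return 1000.0 * wavelength_mm * focal_length_mm / pattern_size_mm

def highest_harmonic_passed(mask_width_um, spacing_um):
    """Highest harmonic lying inside a centered opening of the given width."""
    return math.floor((mask_width_um / 2.0) / spacing_um)

spacing = harmonic_spacing_um(WAVELENGTH_MM, FOCAL_LENGTH_MM, PATTERN_SIZE_MM)
passed = {width: highest_harmonic_passed(width, spacing)
          for width in (100, 200, 300)}
print(round(spacing, 1), passed)
```

With the assumed focal length, the 100 μ, 200 μ, and 300 μ masks pass through the first, third, and fifth harmonics, matching the values stated in the text; a substantially different focal length would not reproduce all three.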

The reconstructed images were photographed with Polaroid 4x5 Land film (type 55P/N) by a camera inserted in the image plane. Exposure times varied from 5 x 10^-3 sec. for the brighter images to 1 sec. for the weaker images, and the laser beam was attenuated with neutral density filters for the brightest images. The changes in exposure time and attenuation can be considered to be analogs to the energy normalization used by the computer model and pupillary control of the human eye. In the following photographs, the letters a, b, and c will denote the object pattern, spatially filtered transform, and the reconstructed image, respectively; photographs of images obtained with the spatial filter tilted 45° are indicated by bt and ct (in Figs. 12 through 18).

Optical  low-pass  spatial  filtering  is  normally  accomplished  with 
circular  spatial  filters.  However,  the  computer  model  has  been  using 
square  spatial  filtering  and  since  the  results  of  the  invariance  of  form 
perception  under  rotation  appeared  more  significant  with  a  square  filter, 
optical  spatial  filtering  was  accomplished  with  square  spatial  filters. 
This decision was indeed fortunate for reasons that will become apparent
later.  The  circle  filters  were  constructed  by  holes  punched  in  aluminum 
sheet  metal,  whereas  the  square  filters  were  constructed  using  micro¬ 
photography  techniques. 

Results. The gross features of the reconstructed image are little affected by the filter shape. This effect and the principle of closure are illustrated by Fig. 12, on the following page, with a dot G and dot form using 200 μ square and circle filters. The Gestalt "whole," a uniform distribution of object dots closing to form one group, is illustrated in Fig. 13. (The object dots were not perfectly uniform in size or arrangement, but they do convey the concept.) The principle of proximity is illustrated in Fig. 14, where uniformly shifted object dots form three groups. The principle of similarity is illustrated in Fig. 15 (on page 45), where two rows of small object dots between two rows of larger object dots form groups. The effect of proximity over similarity is illustrated by Fig. 16, with the grouping of dots and lines. Note the decreased effects of closure, proximity, and similarity when the filter is tilted 45°. These differences will be discussed later. The figure-ground principle is illustrated with a dot object R embedded in dot noise. Figure 17a contains a large dot R with small dot noise, whereas Fig. 17b contains an R and dot noise constructed with equal size dots. The figure-ground principle is also illustrated with the F/B object in Fig. 18. Note the brightness of the F versus the B due to similarity.

Discussion. The effects of spatial filtering on proximity are
precisely those described by Hochberg and Hardy, whereby proximity
produces a commensurate intrarow increase in brightness differences which,
in turn, produces alternating dark and light columns that reorganize
perception from rows to columns (Ref 22).


Fig. 12

Principle of Closure Illustrated by Dot G and Dot Form
(200 μ Circle and Square Filter)


Fig. 13

Gestalt "Whole" Illustrated by Dot Pattern (Square Filter)


Fig. 17

Figure-Ground Illustrated by Dot R in Dot Noise
(200 μ Square Filter)


Fig. 18

Figure-Ground Illustrated by F/B Dot and Bar Pattern
(200 μ Square Filter)

These results strongly validate the model in that they illustrate the
principles of perception that the Gestalt psychologists have been
describing for almost 50 years. The grouping, closure, and figure-ground
relationships have occurred not because of a specified algorithm for
each of these tasks, but rather as an inadvertent concomitant of the
processing for a pattern recognition algorithm, i.e., low-pass spatial
filtering.
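
The grouping by proximity described above can also be sketched numerically. The following is only a minimal illustration, not the optical procedure used in this investigation. It assumes a discrete two-dimensional Fourier transform (NumPy) in place of the lens, a hypothetical 64 x 64 dot matrix, and a square window that keeps the dc term and the first six harmonics; the function name `square_lowpass` and all dimensions are assumptions chosen for the example.

```python
import numpy as np

def square_lowpass(image, n_harmonics):
    """Keep the dc term and the first n_harmonics harmonics along each axis
    (a square window in the 2-D Fourier plane), then reconstruct the image."""
    spectrum = np.fft.fft2(image)
    # Integer harmonic index of every FFT bin (0, 1, 2, ..., -2, -1).
    ky = np.fft.fftfreq(image.shape[0]) * image.shape[0]
    kx = np.fft.fftfreq(image.shape[1]) * image.shape[1]
    window = (np.abs(ky)[:, None] <= n_harmonics) & (np.abs(kx)[None, :] <= n_harmonics)
    return np.real(np.fft.ifft2(spectrum * window))

# Hypothetical dot matrix: dots 4 pixels apart vertically, 16 apart horizontally.
size = 64
dots = np.zeros((size, size))
dots[::4, ::16] = 1.0

filtered = square_lowpass(dots, n_harmonics=6)

# Brightness variation along each direction of the reconstructed image.
column = filtered[:, 0]   # down one column of dots
row = filtered[0, :]      # across the columns
variation_within_column = column.max() - column.min()
variation_across_columns = row.max() - row.min()

print(variation_within_column, variation_across_columns)
```

In this sketch the closely spaced dots within a column are no longer resolved, so each column reconstructs as a continuous bar, while the widely spaced columns remain distinct. The dots therefore group into columns without any grouping rule being programmed.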

These results also suggest that the problem of describing perceived
texture is amenable to low-pass spatial filtering techniques, and that
texture can be described in terms of proximity, similarity, and
brightness distribution. An obvious next question is: What are the
parameters of low-pass spatial filtering that group and close pattern
elements in terms of proximity and similarity? Answering this question
requires a return to spatial filtering theory.

Spatial  Filtering  Revisited 

As previously discussed, the reconstructed image from a spatially
filtered transform can be interpreted as an interference pattern derived
from summed, transformed, unfiltered spatial frequencies. Additionally,
the Fourier transform spreads detail features primarily at the outer
portion of the transform, whereas the gross features are primarily
localized at the central portion of the transform. With pattern feature
information arranged in this manner, the question arises: How does the
spatial filter affect pattern features that will be reconstructed from a
spatially filtered transform? There are several criteria which govern
image feature resolution, e.g., the Rayleigh criterion of resolution for
discriminating between two point sources. Nevertheless, to be consistent
with the filter properties of the model, image feature reconstruction
will be interpreted in terms of spatial filter bandwidth (filter size).

Filter bandwidth is generally defined as the frequency distance
between two half-power points, where a half-power point is 0.707 of the
maximum amplitude. Amplitude squared yields intensity, a term more
commonly found in optics literature. Thus, the filter bandwidth may also
be defined as the frequency distance between two half-intensity points,
i.e., the half-width of the maximum, where the half-width is taken at
0.5 of the maximum intensity (Ref 6:348).

The function of the filter bandwidth in resolving image features
will be explained with the aid of Fig. 19. Consider two resolved image
features represented by intensity wave patterns at X1 and X2, where the
distance X1 - X2 is larger than the filter bandwidth (Fig. 19a). As
the distance X1 - X2 decreases, the wave patterns reinforce each
other, the average intensities of X1 and X2 merge, and the images at X1
and X2 cannot be resolved (Fig. 19b, c). The same effect of unresolved
images will occur if the object positions remain fixed and the filter
bandwidth is decreased. A decreased filter bandwidth blocks the higher
spatial frequencies which contain the necessary intensity distribution
required for image feature reconstruction and, hence, image feature
resolution.
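
The two ways of losing resolution described above (moving the features together, or narrowing the filter) can be sketched in one dimension. This is an illustrative sketch only: it assumes a discrete Fourier transform in place of the optical transform, two single-sample point features, and a hypothetical resolution test (the intensity midway between the features must fall below half the intensity at the features). The names `lowpass_1d` and `is_resolved` and all sizes are assumptions for the example, not part of the model.

```python
import numpy as np

def lowpass_1d(signal, n_harmonics):
    """Zero every Fourier harmonic above n_harmonics and reconstruct."""
    spectrum = np.fft.fft(signal)
    k = np.fft.fftfreq(signal.size) * signal.size   # integer harmonic index
    spectrum[np.abs(k) > n_harmonics] = 0.0
    return np.real(np.fft.ifft(spectrum))

def is_resolved(separation, n_harmonics, size=256):
    """True when the filtered intensity dips between two point features.

    separation is the distance between the features in samples (even).
    The half-intensity dip test is an assumed criterion for this sketch.
    """
    signal = np.zeros(size)
    left = size // 2 - separation // 2
    right = left + separation
    signal[left] = 1.0
    signal[right] = 1.0
    intensity = lowpass_1d(signal, n_harmonics) ** 2   # intensity = amplitude squared
    midpoint = (left + right) // 2
    return bool(intensity[midpoint] < 0.5 * min(intensity[left], intensity[right]))

print(is_resolved(separation=40, n_harmonics=8))   # widely spaced features
print(is_resolved(separation=8, n_harmonics=8))    # features moved together
print(is_resolved(separation=40, n_harmonics=2))   # same spacing, narrower filter
```

Under these assumptions the first case is resolved and the other two are not, which mirrors the argument that a smaller separation and a smaller bandwidth produce the same merging of image features.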
This heuristic development can be quantified as follows. It is well
known (Ref 17:63) that the width of the central maximum of the Fraunhofer
diffraction pattern for a square aperture is given by

$$\Delta X = \frac{2 \lambda d}{l} \tag{15}$$

where λ = wavelength of light source
      d = distance from aperture to the image plane
      l = aperture width

and the half-width of the central maximum is given by

$$\Delta X = \frac{\lambda d}{l} \tag{16}$$

Similarly, the half-width of the central maximum of the Airy diffraction
pattern for a circle aperture is given by

$$\Delta X = 1.22 \, \frac{\lambda d}{l} \tag{17}$$

where l = diameter of the circular aperture.

Consider the following parameters:

l = 1/16 in. square or circle pattern feature
λ = 6.328 x 10^-7 meters (for a neon-helium laser)
d = 20 in.

Using these values for Eqs (16) and (17) results in the half-width
of the central maximum

ΔX = 202 x 10^-6 meters for a square pattern feature
   = 246 x 10^-6 meters for a round pattern feature.
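
The arithmetic behind these two values can be checked with a short calculation. The sketch below assumes the usual far-field forms for the half-width of the central maximum, λd/l for a square aperture and 1.22 λd/l for a circular one, and converts the stated dimensions from inches to meters; the function and variable names are chosen only for this example.

```python
INCH = 0.0254  # meters per inch

wavelength = 6.328e-7      # meters
distance = 20 * INCH       # aperture to image plane, meters
feature = INCH / 16        # 1/16 in. pattern feature, meters

def half_width_square(wavelength, distance, width):
    """Half-width of the central maximum for a square aperture (assumed form)."""
    return wavelength * distance / width

def half_width_circle(wavelength, distance, diameter):
    """Half-width of the central maximum for a circular aperture (assumed form)."""
    return 1.22 * wavelength * distance / diameter

square_um = half_width_square(wavelength, distance, feature) * 1e6
circle_um = half_width_circle(wavelength, distance, feature) * 1e6

print(round(square_um, 1), round(circle_um, 1))
```

With these inputs the square feature gives about 202 μ, and the circular feature gives about 247 μ, within rounding of the value quoted above.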

It follows then that a 200 μ low-pass spatial filter would be expected to
resolve a 1/16 in. square pattern feature, although it would not be
expected to resolve a 1/16 in. circular pattern feature using the
half-width criterion. Therefore, equating spatial filter size (f_s) to the
half-width of the central maximum results in

$$f_s = \frac{\lambda d}{l} \tag{18}$$

which is a generalized relationship between the spatial filter size and the
resolvable pattern feature size in terms of the optical parameters.

It is desirable to quantify the resolvable pattern feature size in
terms of spatial filter bandwidth. A convenient reference for pattern
feature sizes would be the overall pattern size. Thus, for a 1/2 in.
pattern (L) with the preceding optical parameters and a 200 μ low-pass
spatial filter, the spatial filter bandwidth (BW) can be obtained by
computing the width (W), the diffraction pattern zero crossings, from

$$BW = \frac{f_s}{W} \approx 8, \qquad W = \frac{\lambda d}{L} \tag{19}$$

where f_s is the spatial filter size. Let the harmonics passed by the
spatial filter be obtained from

$$\frac{BW}{2} = 4 \text{ spatial harmonics} \tag{20}$$



Thus, the spatial filter bandwidth passes the dc and first three spatial
harmonics, and is designated the 7 x 7 low-pass spatial filter of the
Tallman-Radoy algorithm. In sum, a 200 μ low-pass spatial filter will
pass the dc and first three spatial harmonics of a diffraction pattern
from a 1/2 in. object pattern and will resolve pattern features
approximately greater than 1/16 in. in size using the preceding optical
parameters. These are general solutions for simple cases; however, it does
appear conceptually valid and desirable to quantify pattern features in
the preceding manner.
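
The count of harmonics passed by the filter can be reproduced with a short calculation. The sketch below rests on two assumptions that are not spelled out in the text: that the harmonics of a pattern of overall size L fall λd/L apart in the transform plane, and that the 200 μ filter is centered on the dc term, so that it extends 100 μ to either side. The variable names are illustrative only.

```python
import math

INCH = 0.0254  # meters per inch

wavelength = 6.328e-7        # meters
distance = 20 * INCH         # meters
pattern_size = 0.5 * INCH    # overall 1/2 in. pattern, meters
filter_width = 200e-6        # 200 micron square filter, meters

# Assumed spacing between adjacent harmonics in the transform plane.
harmonic_spacing = wavelength * distance / pattern_size

# Highest harmonic that still falls inside a filter centered on dc.
highest_harmonic = math.floor((filter_width / 2) / harmonic_spacing)

# dc plus the positive and negative harmonics along one axis.
window_size = 2 * highest_harmonic + 1

print(harmonic_spacing * 1e6, highest_harmonic, window_size)
```

Under these assumptions the harmonics are about 25 μ apart, the filter passes the dc term and the first three harmonics on either side, and the window is seven harmonics wide along each axis, in agreement with the 7 x 7 designation.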

Note  that  there  is  an  inverse  relationship  between  the  feature  size 
and  the  half-width  of  the  central  maximum  from  Eq  (16).  Thus,  the  larger 
feature  size  produces  the  smaller  diffraction  pattern,  whereas  the  smaller 
feature  produces  the  larger  diffraction  pattern  as  Fig.  20  illustrates. 


Fig. 20

Diffraction Patterns of a 1/16 in. Square and a 1/2 in. Square
with Superimposed 100 μ Square Spatial Filter


Also note the superimposed 100 μ square spatial filter on these
diffraction patterns. The 100 μ filter easily encloses the central maximum
as well as higher spatial frequencies of the smaller diffraction pattern,
whereas the 100 μ filter does not even enclose the half-width of the
central maximum of the larger diffraction pattern. Thus, the 1/2 in.
pattern would be resolved; however, the 1/16 in. pattern would not be
resolved. The cutoff of the spatial filters appears very sharp. Note
the four dots that make up the upper curved portion of the R,
reconstructed from a 200 μ spatial filter shown in Fig. 21. The resolved
gap is slightly more than 1/8 in. wide, whereas the other two unresolved
gaps are exactly 1/8 in. wide.


Fig.  21 

Low-Pass  Spatially  Filtered  Dot  R 
(200 μ Circle Filter)

In  sum,  it  was  the  spatial  filter  bandwidth  that  blocked  the  spatial 
frequencies  above  the  diffraction  pattern  central  maxima  of  the  Gestalt 
figure  features  that  produced  the  grouping  effects  that  the  Gestalt 
psychologists  term  principles  of  perceptual  organization. 

Whereas the Gestalt principles, in terms of image resolution, can
be considered to be a function of the spatial filter bandwidth, there
appear to be other ramifications of the spatial filter. If pattern
recognition is defined in terms of feature discrimination and the feature
space is considered to be the transform domain, then feature discrimination
in terms of reconstructed image resolution can be described in terms of
spatial filter bandwidth. It was illustrated that the larger the
pattern feature, the smaller the transform diffraction pattern;
conversely, the smaller the pattern feature, the larger the transform
diffraction pattern. Therefore, if the discriminating pattern feature is
large (contour), then a small filter size which passes only the
half-power of the central maximum of that feature's diffraction pattern is
required for image resolution. If the discriminating pattern feature
is small (inflection), then a larger filter size must be used to allow
the half-power of the central maximum of that feature's diffraction pattern
to pass for image resolution. It is realized that the preceding
discussion has greatly simplified a complex human perceptual phenomenon.
Whether the half-power criterion is valid for human feature
discrimination remains to be proven. However, the concepts of object/image
size in terms of spatial filter bandwidth appear to be a good starting point
for quantifying the metrics of visual form in a manner that has been
demonstrated to show psychological correlates. Quine is presently
quantifying these concepts (Ref 41).

Now that image feature resolution in terms of spatial filter
bandwidth is better understood, an investigation of geometric illusions
with spatial filters will be pursued.

Geometric  Illusions 

Introduction. Psychologists have been interested in geometric
illusions mainly because perceptual failures, as well as perceptual
successes, provide clues to the visual processes. While many theories
have been proposed to explain the distortions of simple figures, none is
widely accepted (Ref 18:141-150). As Ittelson points out, the current
status of visual illusions may be summarized best by a quote that Luckiesh
wrote in his classic volume "Visual Illusions" 45 years ago: "Some
theoretical aspects of the subject are still extremely controversial"
(Ref 32:viii).

If the model postulated in this paper is valid, then it should
provide insight into these perceptual aberrations in terms of low-pass
spatial filtering. Therefore, this section is concerned with an initial
investigation to support this viewpoint.

Research Method. The geometric illusions investigated were
constructed from black electrical tape, aluminum foil, or exposed
photographic film, and ranged in size from 0.5 in. to 0.75 in. The
construction method depended primarily upon the complexity of the figure and
the ease of construction each material afforded. The geometric illusions
were low-pass spatially filtered with the same Fourier optics techniques and
spatial filters as were the Gestalt figures. As before, the letters a,
b, and c will denote the object, spatially filtered transform, and
reconstructed image, respectively, for the following photographs.

Results. Perhaps the best known man-made geometric illusion is the
Muller-Lyer illusion illustrated in Fig. 22a, whereby equal line lengths
are perceived as unequal. (This illusion, shown on the following page,
was constructed from black electrical tape; thus the bottom line lengths
are not exactly equal.) The reconstructed image of Fig. 22c illustrates
the unequal line lengths resulting from low-pass spatial filtering. It
is well known that the Muller-Lyer illusion does not depend upon the
lines, whereas it does depend upon the angle of the arrow tips (Ref 18:
140). This fact is illustrated by Fig. 23. (This illusion was
constructed by slitting aluminum foil; thus both spaces between the arrow
tips are more nearly equal than in the preceding illusion.)
The mechanism for the line length and distance changes in the
Muller-Lyer illusion is the unresolved arrow tips resulting from
low-pass spatial filtering. In the first illusion, the acute angles are
smaller features than the obtuse angles and are resolved less, which
results in line contractions; since the obtuse angles are unresolved
also, line expansion results. The effect of the angle size is
illustrated more effectively by the reconstructed image from part of a
spatially filtered Hering illusion illustrated by Fig. 24. Note that
the obtuse angles are more resolved than the acute angles. The distance
changes of the second Muller-Lyer illusion are due to a slightly
different reason. The point of the arrow tips is not resolved due to
low-pass spatial filtering; thus, the reconstructed images favor the larger
features of the arrow tips, located behind each arrow point.

The Ponzo illusion, illustrated by Fig. 25, also perceptually
distorts equal line lengths, whereby the upper horizontal line is perceived
as being longer than the lower horizontal line. This distortion is also
accomplished  by  low-pass  spatial  filtering  as  Fig.  25c  illustrates.  In 
this  illusion,  it  appears  that  proximity  is  the  reason  for  the  line 
length  change.  The  smaller  spaces  between  the  horizontal  and  vertical 
lines  cannot  be  resolved,  thus  extending  the  closest  (upper)  horizontal 
line  tips  to  the  vertical  lines. 

Discussion. It appears from this preliminary investigation that
most geometric illusions can be explained by unresolved image features
of low-pass spatially filtered transforms in terms of proximity and angle.
Postulating the human perceptual space as the reconstructed image domain,
it becomes apparent that the line length changes and the distance changes
are perceived as such because they actually do occur. The model "fails"
in the same way as humans "fail" with these illusions.
If one considers perceptual depth as a function of image intensity,
then the preceding explanations of the Muller-Lyer and Ponzo illusions are
not too unlike Gregory's explanation. Gregory considers the distortion
of visual space to be due to inappropriate constancy scaling, whereby
apparent depth from corners and angles is responsible for the illusion
(Ref 18:147-160). Additionally, it is interesting to note that the
different line lengths and distances of the reconstructed images would be
amenable to the moment measurement techniques reported by Gill (Ref 15).

Now that the model appeared conceptually valid, another look was
taken in regard to the spatial filter shape and perceptual organization.

VI. Extensions of the Kabrisky Model

Spatial Filter Shape Correlates

Introduction. This section is a post hoc analysis of the
psychological correlates of the preceding investigations. The filter shape
appeared to play a significant role in the letter identification under
rotation experiment and the Gestalt principles of perceptual
organization. Therefore, the filter shape is investigated further for
additional psychological correlates of perceptual organization.

The Square Filter. It was noted that the square filter provided
more significant correlations than the circle filter for the lower
filter sizes in the letter identification under rotation experiment.
The square filter accentuates the effect of the higher spatial
frequencies at the diagonals of the square filter. Thus, the Euclidean
distances for the letters did not monotonically increase as rotation
increased, and the square filter had more maximum Euclidean distances at
60° than the round filter had at 60°, as Table V illustrates.

Table V

Number of Maximum Euclidean Distances at 60°

Spatial Filter
Shape            3x3    5x5    7x7    9x9

Circle             2     11     10     10
Square            12     15     14     10
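
The geometric reason a square filter favors higher spatial frequencies along its diagonals can be sketched as follows. This is an illustration under stated assumptions, not the computation that produced Table V: the filter is taken to be an ideal square window of a given half-width centered on the dc term, and the helper names `square_filter_passes` and `radial_cutoff` are hypothetical.

```python
import numpy as np

def square_filter_passes(radial_frequency, angle_deg, half_width):
    """True when a frequency component at the given radius and direction
    lies inside an ideal square low-pass window of the given half-width."""
    angle = np.deg2rad(angle_deg)
    fx = radial_frequency * np.cos(angle)
    fy = radial_frequency * np.sin(angle)
    return bool(abs(fx) <= half_width and abs(fy) <= half_width)

def radial_cutoff(angle_deg, half_width):
    """Highest radial frequency the square window passes along a direction."""
    angle = np.deg2rad(angle_deg)
    return float(half_width / max(abs(np.cos(angle)), abs(np.sin(angle))))

# A window passing the dc and first three harmonics along each axis.
for angle in (0, 45, 60, 90):
    print(angle, round(radial_cutoff(angle, half_width=3), 3))

# A fourth-harmonic component is blocked on the axes but passed on the diagonal.
print(square_filter_passes(4, 0, half_width=3))
print(square_filter_passes(4, 45, half_width=3))
```

In this sketch the radial cutoff rises from the half-width on the axes to the half-width times the square root of two on the diagonal, so a rotated pattern can bring frequency components into or out of the window. A circular window has the same cutoff in every direction and shows no such dependence on rotation.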
As previously mentioned, the human visual system is spatially
polarized, with form recognition superior in the horizontal and vertical
orientations as compared to oblique orientations. Arnoult showed graphs
illustrating the accuracy and latency of shape discrimination with
curvilinear shapes as a function of angular orientation to be non-monotonic
through 0° to 180° clockwise and counter-clockwise (Ref 1).
Discrimination times with lines and rectangles are reported by Attneave and
Olson as being faster with horizontal and vertical orientation than with
orientations 45° to the right or left (Ref 3). Taylor, in discussing
several visual discrimination and orientation experiments, illustrates
non-monotonic discrimination results with orientation as well as the
non-monotonic ability to assess orientation (Ref 48). These parallel results
from these two different tasks led him to suggest that meridional
differences of discrimination may be due to the difficulty of orientation
assessment. It will be assumed that the spatial filter shape of the
model plays a significant part in meridional differences. The
assumption will be reinforced with the more basic psychological
correlates concerning the Gestalt principles.

In reporting the results of the low-pass spatial filtering of the
Gestalt patterns, it was also noted that the grouping effects due to
proximity and similarity decreased when the patterns were rotated 45° with
respect to the square filter. Hake has summarized Rush's experiments on
the relative strength of the Gestalt principles of grouping with dot
matrices by stating that the influence of proximity and similarity is
affected by whether the dot matrix is horizontal, vertical, or oblique
(Ref 21:57-58). By changing the distances between the dots of the dot
matrices, Rush noted that proximity was slightly more effective in the
vertical and oblique than in the horizontal, whereas similarity was more
effective in the oblique and least effective in the vertical. Olson and
Attneave have reported similar results in that arrays of several figural
or textural variables produced different similarity groupings which
depended upon the orientation of the stimulus (Ref 36). The stimuli
differed in slope and linearity of elements as well as combinations of
slopes into angles. The effectiveness of similarity under these
conditions was determined by subject reaction time in locating a disparate
quadrant in a circular array with and without head tilt at 45°. The
results were that slope differences produced better grouping than
linearity and arrangements of slopes, and when the whole array was oriented
at different angles, the horizontals and verticals gave better groupings
than diagonals. They further noted that the reaction time was a function
of both retinal and gravitational orientation. Whereas the effect of
gravitational orientation was more consistent than the effect of
retinal orientation, the retinal orientation affected the mean reaction
time more than twice as much as the physical orientation. The important
points are that the retinal orientation of these stimuli does affect
similarity grouping and cannot be accounted for entirely on a projection
level but, as Olson and Attneave suggest, "... grouping and segregation
depend on identities and differences between descriptors which in turn
represent relationships between the stimulus array and an internal
Cartesian reference system." It is suggested that this "internal
Cartesian reference system" can be correlated to the Kabrisky model in
terms of the horizontal and vertical axes of the spatial filter.

These psychological investigations just discussed suggest a
correlate in terms of the model spatial filter shape. If one considers the
change which the spatial transform undergoes during rotation in a square
filter, it becomes apparent that the maximum change of the vertical and
horizontal components would occur at the diagonals of the square. Thus,
for some forms, the image from the rotated object would have the largest
deviation (in terms of spatial frequencies) from a stored prototype image.
Additionally, the increased higher spatial harmonics from the diagonals
would reduce the Gestalt groupings as previously illustrated. It is
suggested that these factors could account for the non-monotonic
discrimination ability as well as the changes in the Gestalt groupings under
pattern rotation. However, the previous psychological investigations also
note perceptual changes with horizontal versus vertical orientation that
cannot be accounted for by a square filter that would resolve both the
horizontal and vertical components of a bilaterally symmetrical pattern
equally. An example of a non-symmetrical perceptual aberration is
perhaps the simplest geometric illusion, the horizontal-vertical line
illusion shown in Fig. 26. The vertical line appears to be longer than
the horizontal line even though both lines are the same length. It is
interesting to observe that the illusion is diminished if the figure is
rotated by 90°. Clearly, with respect to the model, these effects cannot
be explained in terms of a square shaped spatial filter.


Fig. 26

Horizontal-Vertical Line Illusion

The Rectangle Filter. It would appear that a rectangular spatial
filter shape is required to solve the preceding problems. The disparate
horizontal and vertical lengths of the rectangular spatial filter would
resolve a pattern by different amounts depending upon pattern orientation.
Additionally, diagonal pattern orientations would provide the increased high
spatial frequencies required for reducing the Gestalt groupings that
correlate with previously discussed psychological experiments, as well as
the large deviations from stored prototypes used to explain non-monotonic
discrimination of forms. The rectangular spatial filter would also
account for the horizontal-vertical illusion disparities as well as the
decreased illusion effects when the figure is tilted 90°.
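
The claim that a rectangular filter resolves a pattern by different amounts along its two axes can be sketched numerically. The example below is a simplified illustration, not a simulation of the optical bench: it assumes an ideal rectangular window in a discrete Fourier plane and measures the blur of a single point along each axis. The function name `blur_widths` and the window sizes are assumptions; the 10 by 6 window only mimics a 1:0.6 ratio of side lengths.

```python
import numpy as np

def blur_widths(half_width_x, half_width_y, size=128):
    """Return the (x, y) widths, in pixels at half-maximum intensity, of the
    image of a single point formed through a rectangular low-pass window."""
    k = np.fft.fftfreq(size) * size   # integer harmonic index of each bin
    window = (np.abs(k)[:, None] <= half_width_y) & (np.abs(k)[None, :] <= half_width_x)
    # The image of a point is the inverse transform of the window itself.
    intensity = np.abs(np.fft.ifft2(window.astype(float))) ** 2
    intensity = np.fft.fftshift(intensity) / intensity.max()
    center = size // 2
    width_x = int(np.sum(intensity[center, :] >= 0.5))
    width_y = int(np.sum(intensity[:, center] >= 0.5))
    return width_x, width_y

width_x, width_y = blur_widths(half_width_x=10, half_width_y=6)
print(width_x, width_y)
```

In this sketch the point is blurred less along the long side of the window than along the short side, so equal horizontal and vertical lines would be reconstructed with unequal sharpness. A square window gives equal widths, which is consistent with the argument that a square filter cannot distinguish horizontal from vertical.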

Experiments  with  the  Rectangle  Filter 

Introduction.  The  preceding  discussion  illustrated  the  failings  of 
the  square  spatial  filter  in  terms  of  basic  psychological  correlates  and 
suggested  a  rectangular  shaped  spatial  filter.  This  section  concerns  a 
preliminary  investigation  of  the  characteristics  of  a  rectangular  spatial 
filter  in  satisfying  the  previous  psychological  criteria. 

Research Method. The pattern construction remained the same as
outlined in previous sections. Low-pass spatial filtering was accomplished
with a rectangle (1:0.6) filter using the same Fourier optics techniques
previously used for the Gestalt figures and geometric illusions.
Similarly, the letters a, b, and c will denote the object, spatially
filtered transform, and reconstructed images for the following photographs.
These photographs have been arranged to show the filter tilted, not the
object, to allow a better comparison of the reconstructed images (Figs.
27 through 30 on pages 63 and 64).

Results. The effect of a rectangle 300 μ x 240 μ low-pass spatial
filter on the horizontal-vertical illusion is illustrated in Fig. 27 on
the following page. The equal object lines (3/8 in.) are now unequal in
the reconstructed image. The vertical line is 1/16 in. longer than the
horizontal line due mainly to the less resolved horizontal line. Figure
27 also illustrates the illusion turned 90°, which resulted in making the
vertical line 1/32 in. longer than the horizontal line. Also note that
the vertical line is brighter than the horizontal line in both cases,
again suggesting that brightness may play a part in geometric illusions.
The illusion tilted 45° clockwise and counter-clockwise is illustrated in
Fig. 27. Both the horizontal and vertical line lengths are approximately
equally resolved and the brightness distribution is more uniform. It was
noted that the tips of the lines were somewhat distorted. This
observation prompted an investigation with a single line, illustrated in
Fig. 28, oriented at 0, 45, 60, and 90 degrees, respectively. Note the line
length changes as well as line tip distortion. These line tip distortions
suggest gross distortions for complex line patterns such as letters. The
rotated letter K at 0, 45, and 90 degree angles, illustrated in Fig. 29,
demonstrates gross distortion at 45° with a 200 μ x 120 μ rectangle filter.
The effect of the rectangle filter on the principles of similarity and
proximity is demonstrated by Fig. 30, page 64. Especially note the
increased resolution of the dots of this pattern at 45° as well as the
dissimilar groups in the horizontal and vertical orientations.

Discussion. The results of this investigation with the rectangle
filter correlate well with previous psychological findings of
discrimination degradation upon rotation as well as the perceptual failure
noted in the horizontal-vertical illusion. These results should only be
considered to warrant further investigation. It was difficult to keep the
transform centered during rotation, and small changes in filter orientation
somewhat disturb the reconstructed image. However, these results do
validate the concepts and suggest that the Kabrisky model can be extended
with a rectangular shaped spatial filter for additional psychological
correlates.


Fig. 28

Line (300 μ x 180 μ Rectangle Filter)


Fig. 29

Letter K (200 μ x 120 μ Rectangle Filter)


0°    90°    45°

Fig. 30

Principles of Proximity and Similarity with Dot and Line Pattern
(300 μ x 180 μ Rectangle Filter)

It is well known that individuals from other cultures, e.g., the
Zulus, do not see many of the geometric illusions that Western man
sees (Ref 18:160-163). In terms of the model, this fact suggests the
existence of either an adaptive filter shape that is culturally
conditioned or a selective attention mechanism that learns to "pay attention
to" different parts of the cortical transform. Whereas the rectangular
shaped spatial filter is mimicking, at least on a macroscopic level, human
perceptual phenomena, the shape and size ratios of the rectangular
spatial filter should be considered tentative pending a more accurate
mapping of the visual space.

Comments on Parallel Processing

Although  it  is  not  intended  to  review  the  various  theories  of 
visual  pattern  recognition,  there  is  one  theory  hypothesi zing'  perceptual 
processing  that  greatly  parallels  the  pattern  recognition  scheme  of  the 
model  reported  in  this  paper. 

Neisser, a cognitive psychologist, asks a question much like the one Kabrisky, et al., have asked in the identification game: "How does the subject know an A when he sees one?" (Ref 35:102). Drawing on numerous psychological research studies, he suggests the following answer to his question: The A is segregated from the other simultaneously presented figures by pre-attentive processing mechanisms that emphasize the "whole" rather than the "part" in the figures that are constructed and reduplicated in parallel throughout the input field. Neisser further suggests that focal attention is then devoted to the A, which may lead to internal verbalization or a "sequence of comparisons with stored records of earlier synthesis to determine the proper classification for the present stimulus." Perhaps the weakest area of Neisser's theory is that he does not elaborate on possible cortical mechanisms to achieve focal attention and figural synthesis. However, there exists a striking parallel between his theory and the Kabrisky model, in that the low-pass spatial filtering of transformed patterns has been demonstrated to emphasize the whole rather than the part of the pattern.

Another question remains to be answered: How does one pattern become reduplicated throughout the input field? Consider a Mahaffey-type transform between cortical areas 18 and 19. The Mahaffey transform maps the object at one plane into many smaller parallel areas in the next plane in a Fourier-like manner. If an assumption is made that the input pattern is spatially transformed at area 18, then by a Mahaffey-type transform, area 19 could contain different spatially filtered images from the visual field as well as allow simultaneous attention to the visual field at one cortical level. This cortical scheme is illustrated by Fig. 31. Thus, the simultaneous foveal acuity as well as the gross features of the parafoveal visual field would be at the same cortical level for a selective attention mechanism. This point is important because if the reader will focus his visual attention on the period at the end of this sentence, it will be obvious that he can report surrounding features as well as the period without changing his external "look." Additionally, the human perceptual system can describe the Gestalt dot figures in terms of dots or groups, which again indicates the need for simultaneous different spatially filtered pattern transforms at one cortical level.

Fig. 31. Cortical Scheme of the Extended Model

Consider the figure-ground problem of the candlestick illusion shown in Fig. 32a. In viewing this object form, one may perceive two faces or a candlestick, depending upon the "internal set" that the viewer has chosen. If different spatially filtered images of this figure were available at one cortical level, as illustrated by Fig. 32 c1, c2, and c3, then a selective attention mechanism would "see"

[...]

external observer, it would offer no additional information to an internal classification scheme as demonstrated by the Tallman-Radoy algorithm.

In  sum,  it  appears  that  the  extension  of  the  Kabrisky  model  with  a 
Mahaffey-type  transform  yields  a  model  of  the  human  visual  system  that
correlates  well  with  Neisser's  theory  of  human  pattern  recognition. 


VII.  Conclusions  and  Recommendations

An exploratory investigation has been undertaken to correlate the Kabrisky model of the human visual system, as embodied by the Tallman-Radoy computer algorithm and a Fourier optics homolog, with well-known psychological phenomena: identification of defocused letters and rotated letters, Gestalt principles of perception, and several geometric illusions. The following conclusions are supported by the results of this investigation:

1. The Kabrisky model of the human visual system, embodied by the Tallman-Radoy computer algorithm and the Fourier optics homolog, appears to be valid in providing a priori quantitative psychological metrics of visual perception.

2. Gestalt principles of proximity, similarity, closure, and figure-ground perception as well as several geometric illusions can be explained in terms of object/image feature size, spatial arrangement, and spatial-filter bandwidth.

3.  The  human  perceptual  decision  space  (feature  space) 
is  postulated  to  be  the  image  domain  from  spatially 
filtered  transforms  of  object  forms. 

4. The Kabrisky model can be extended with a rectangular spatial filter and a less densely connected Mahaffey-type transform for further psychological correlations.

Therefore, the general methodology of investigating models of the human visual system against well-known psychological correlates and, conversely, using psychological correlates to extend models appears to be

Appendix A

Subject  Data  from  the  Identification  of 
Defocused  Letters  Experiment 

This  appendix  contains  a  table  of  the  mean  number  of  correctly 
identified  letters  per  subject  per  trial  for  the  identification  of 
defocused  letters  experiment.

Appendix  B

Confusion  Matrices  of  Letter  Identification 
Errors  under  Rotation 


Confusion matrices of the human identification errors from the identification of rotated letters experiment at presentation angles of 0, 30, 60, 90, 120, and 150 degrees are contained in this appendix.

Confusion Matrix from Identification of Rotated Letters Experiment at 60°

Confusion Matrix from Identification of Rotated Letters Experiment at 150°

Appendix C

Tables of Euclidean Distance for
Rotated Alphabet Letters

The Euclidean distance for each letter at each angle referenced to itself at 0 degrees from 3×3, 5×5, 7×7, 9×9 square and circle spatial filters is given in this appendix in table form.

Euclidean Distance for Each Letter at Angles of 30, 60, 90, 120, and 150 Degrees Referenced to Zero Degrees for the 3×3 Circle Filter

Euclidean Distance for Each Letter at Angles of 30, 60, 90, 120, and 150 Degrees Referenced to Zero Degrees for the 5×5 Circle Filter

Euclidean Distance for Each Letter at Angles of 30, 60, 90, 120, and 150 Degrees Referenced to Zero Degrees for the 7×7 Circle Filter

Euclidean Distance for Each Letter at Angles of 30, 60, 90, 120, and 150 Degrees Referenced to Zero Degrees for the 9×9 Circle Filter

Euclidean Distance for Each Letter at Angles of 30, 60, 90, 120, and 150 Degrees Referenced to Zero Degrees for the 3×3 Square Filter

Appendix D

Graphs of Euclidean Distance for
Rotated Alphabet Letters


The Euclidean distance for each letter at each angle referenced to itself at 0 degrees from 3×3, 5×5, 7×7, 9×9 square and circle spatial filters is given in this appendix in graph form.
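As a rough illustration of the distance measure described above, the sketch below compares the low-frequency Fourier terms of a pattern with those of the same pattern after rotation. It is a simplified stand-in rather than the Tallman-Radoy algorithm itself: it assumes a square filter only, omits any normalization, and uses a 90 degree rotation (which needs no interpolation) instead of the 30 degree steps of the experiment. The names `low_frequency_features`, `rotation_distance`, `bar`, and `plus` are hypothetical.

```python
import cmath
import math


def low_frequency_features(image, size):
    """Return the size x size block (size odd) of lowest-frequency DFT terms, centered on DC."""
    rows, cols = len(image), len(image[0])
    half = size // 2
    return [
        sum(
            image[y][x] * cmath.exp(-2j * cmath.pi * (ky * y / rows + kx * x / cols))
            for y in range(rows)
            for x in range(cols)
        )
        for ky in range(-half, half + 1)
        for kx in range(-half, half + 1)
    ]


def euclidean_distance(a, b):
    """Euclidean distance between two equal-length vectors of complex terms."""
    return math.sqrt(sum(abs(p - q) ** 2 for p, q in zip(a, b)))


def rotate90(image):
    """Rotate a square image 90 degrees clockwise."""
    return [list(row) for row in zip(*image[::-1])]


def rotation_distance(image, size=3):
    """Distance between a pattern and its 90-degree rotation in the filtered transform domain."""
    return euclidean_distance(
        low_frequency_features(image, size),
        low_frequency_features(rotate90(image), size),
    )


# Hypothetical 5 x 5 patterns: a vertical bar, and a plus sign that is unchanged by rotation.
bar = [[1.0 if x == 2 else 0.0 for x in range(5)] for _ in range(5)]
plus = [[1.0 if x == 2 or y == 2 else 0.0 for x in range(5)] for y in range(5)]

bar_distance = rotation_distance(bar)    # nonzero: the rotated bar differs from the original
plus_distance = rotation_distance(plus)  # zero: the plus sign maps onto itself
```

A pattern that looks the same after rotation yields zero distance, while an oriented pattern such as the bar yields a positive one; larger values of `size` admit more high-frequency detail into the comparison.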
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